Clip Fusion with Bi-level Optimization for Human Mesh Reconstruction from Monocular Videos

doi:10.1145/3581783.3611978

Residential College	false
Status	Published
Title	Clip Fusion with Bi-level Optimization for Human Mesh Reconstruction from Monocular Videos
Author	Wu, Peng 1; Lu, Xiankai 1; Shen, Jianbing 2; Yin, Yilong 1
Date	2023-10-27
Conference Name	31st ACM International Conference on Multimedia, MM 2023
Source Publication	MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
Pages	105-115
Conference Date	2023/10/29-2023/11/03
Conference Place	Ottawa
Abstract	Human mesh reconstruction (HMR) from monocular video is a key step in many mixed reality and robotic applications. Although existing methods show promising results by capturing temporal information across frames, they predict the human mesh with implicit temporal learning modules in a sequence-to-frame manner. To mine more temporal information from the video, we present a bi-level clip inference network for HMR, which explicitly leverages both local motion and global context for dense 3D reconstruction. Specifically, we propose a novel bi-level temporal fusion strategy that takes both neighboring and long-range relations into consideration. In addition, unlike the traditional frame-wise operation, we investigate an alternative perspective by treating video-based HMR as clip-wise inference. We evaluate the proposed method on multiple datasets (3DPW, Human3.6M, and MPI-INF-3DHP) quantitatively and qualitatively, demonstrating a significant improvement over existing methods (in terms of PA-MPJPE, ACC-Error, etc.). Furthermore, we extend the proposed method to the more challenging Multiple Shots HMR task to demonstrate its generalizability. Visual demos are available at https://github.com/bicf0/bicf-demo.
Keyword	Bi-level Optimization; Clip-wise Inference; Human Mesh Reconstruction
DOI	10.1145/3581783.3611978
Language	English
Scopus ID	2-s2.0-85179547893
Document Type	Conference paper
Collection	University of Macau; THE STATE KEY LABORATORY OF INTERNET OF THINGS FOR SMART CITY (UNIVERSITY OF MACAU)
Corresponding Author	Lu, Xiankai
Affiliation	1. Shandong University, Jinan, China; 2. University of Macao, Macao
Recommended Citation GB/T 7714	Wu, Peng,Lu, Xiankai,Shen, Jianbing,et al. Clip Fusion with Bi-level Optimization for Human Mesh Reconstruction from Monocular Videos[C], 2023, 105-115.
APA	Wu, Peng., Lu, Xiankai., Shen, Jianbing., & Yin, Yilong (2023). Clip Fusion with Bi-level Optimization for Human Mesh Reconstruction from Monocular Videos. MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia, 105-115.
