4D Human Body Capture from Egocentric Video via 3D Scene Grounding

Miao Liu; Dexin Yang; Yan Zhang; Zhaopeng Cui; James M. Rehg; Siyu Tang

We introduce a novel task of reconstructing a time series of second-person 3D human body meshes from monocular egocentric videos. The unique viewpoint and rapid embodied camera motion of egocentric videos raise additional technical barriers for human body capture. To address these challenges, we propose a simple yet effective optimization-based approach that leverages 2D observations from the entire video sequence and human-scene interaction constraints to estimate second-person human poses, shapes, and global motion that are grounded in the 3D environment captured from the egocentric view. We conduct detailed ablation studies to validate our design choices. Moreover, we compare our method with the previous state-of-the-art method for human motion capture from monocular video and show that our method estimates more accurate human body poses and shapes in the challenging egocentric setting. In addition, we demonstrate that our approach produces more realistic human-scene interactions.

updated: Fri Oct 15 2021 23:03:13 GMT+0000 (UTC)

published: Thu Nov 26 2020 15:17:16 GMT+0000 (UTC)

arXiv
