Learning Human Mesh Recovery in 3D Scenes

Zehong Shen; Zhi Cen; Sida Peng; Qing Shuai; Hujun Bao; Xiaowei Zhou

We present a novel method for recovering the absolute pose and shape of a human in a pre-scanned scene given a single image. Unlike previous methods that perform scene-aware mesh optimization, we propose to first estimate absolute position and dense scene contacts with a sparse 3D CNN, and then enhance a pretrained human mesh recovery network by cross-attention with the derived 3D scene cues. Joint learning on images and scene geometry enables our method to reduce the ambiguity caused by depth and occlusion, resulting in more reasonable global postures and contacts. Encoding scene-aware cues in the network also makes the proposed method optimization-free and opens up the opportunity for real-time applications. The experiments show that the proposed network recovers accurate and physically plausible meshes in a single forward pass and outperforms state-of-the-art methods in both accuracy and speed.
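
The cross-attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' network: it is plain single-head scaled dot-product attention without learned projections, written in pure Python. The names `cross_attention`, `body_tokens`, `scene_keys`, and `scene_values` are hypothetical, and the toy vectors stand in for features that, in the paper, would come from the pretrained mesh recovery network and the sparse 3D CNN.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention (illustrative only).

    queries: list of d-dim vectors (assumed: tokens from the human branch)
    keys:    list of d-dim vectors (assumed: tokens derived from scene cues)
    values:  list of dv-dim vectors, one per key
    Returns (outputs, weights): one dv-dim output and one weight row per query.
    """
    d = len(keys[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every scene token.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Weighted average of the scene values.
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy inputs (made up for illustration, not data from the paper).
body_tokens = [[1.0, 0.0], [0.0, 1.0]]
scene_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scene_values = [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]

fused, attn = cross_attention(body_tokens, scene_keys, scene_values)
```

Each row of `attn` sums to one, so every output in `fused` is a convex combination of the scene values. A real model would add learned query, key, and value projections, multiple heads, and residual connections around this core.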

updated: Tue Jun 06 2023 16:35:45 GMT+0000 (UTC)

published: Tue Jun 06 2023 16:35:45 GMT+0000 (UTC)

arXiv
