HSPACE: Synthetic Parametric Humans Animated in Complex Environments

Eduard Gabriel Bazavan; Andrei Zanfir; Mihai Zanfir; William T. Freeman; Rahul Sukthankar; Cristian Sminchisescu

Advances in the state of the art for 3D human sensing are currently limited by the lack of visual datasets with 3D ground truth, including multiple people, in motion, operating in real-world environments, with complex illumination or occlusion, and potentially observed by a moving camera. Sophisticated scene understanding would require estimating human pose and shape as well as gestures, towards representations that ultimately combine useful metric and behavioral signals with free-viewpoint photo-realistic visualisation capabilities. To sustain progress, we build a large-scale photo-realistic dataset, Human-SPACE (HSPACE), of animated humans placed in complex synthetic indoor and outdoor environments. We combine a hundred diverse individuals of varying ages, gender, proportions, and ethnicity, with hundreds of motions and scenes, as well as parametric variations in body shape (for a total of 1,600 different humans), in order to generate an initial dataset of over 1 million frames. Human animations are obtained by fitting an expressive human body model, GHUM, to single scans of people, followed by novel re-targeting and positioning procedures that support the realistic animation of dressed humans, statistical variation of body proportions, and jointly consistent scene placement of multiple moving people. Assets are generated automatically, at scale, and are compatible with existing real-time rendering and game engines. The dataset with evaluation server will be made available for research. Our large-scale analysis of the impact of synthetic data, in connection with real data and weak supervision, underlines the considerable potential for continuing quality improvements and limiting the sim-to-real gap, in this practical setting, in connection with increased model capacity.

updated: Thu Dec 23 2021 22:27:55 GMT+0000 (UTC)

published: Thu Dec 23 2021 22:27:55 GMT+0000 (UTC)

arXiv
