Driving-Signal Aware Full-Body Avatars

Timur Bagautdinov; Chenglei Wu; Tomas Simon; Fabian Prada; Takaaki Shiratori; Shih-En Wei; Weipeng Xu; Yaser Sheikh; Jason Saragih

We present a learning-based method for building driving-signal aware full-body avatars. Our model is a conditional variational autoencoder that can be animated with incomplete driving signals, such as human pose and facial keypoints, and produces a high-quality representation of human geometry and view-dependent appearance. The core intuition behind our method is that better drivability and generalization can be achieved by disentangling the driving signals from the remaining generative factors, which are not available during animation. To this end, we explicitly account for information deficiency in the driving signal by introducing a latent space that exclusively captures the remaining information. This enables the imputation of the missing factors required during full-body animation while remaining faithful to the driving signal. We also propose a learnable localized compression for the driving signal, which promotes better generalization and helps minimize the influence of global chance correlations often found in real datasets. For a given driving signal, the resulting variational model produces a compact space of uncertainty for missing factors that allows for an imputation strategy best suited to a particular application. We demonstrate the efficacy of our approach on the challenging problem of full-body animation for virtual telepresence with driving signals acquired from minimal sensors placed in the environment and mounted on a VR headset.
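The general idea of a conditional variational autoencoder with a separate latent space for the missing factors can be illustrated with a toy sketch. The code below is not the paper's architecture: the class `ToyConditionalVAE`, its linear layers, the dimensions, and the random untrained weights are all hypothetical, and the learnable localized compression of the driving signal is not modeled. It only shows the data flow: during training the encoder sees both the full observation `x` and the driving signal `c`, while at animation time only `c` is available and the latent `z` has to be imputed, here by defaulting to the prior mean.

```python
import math
import random


def init_linear(rng, n_out, n_in, scale=0.1):
    """Hypothetical random initialisation; a real model would learn these."""
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, x):
    """Apply y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def kl_to_standard_normal(mu, logvar):
    """KL divergence from N(mu, exp(logvar)) to N(0, I), summed over dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))


class ToyConditionalVAE:
    """Toy conditional VAE: x is the full observation, c the driving signal."""

    def __init__(self, x_dim, c_dim, z_dim, seed=0):
        self.rng = random.Random(seed)
        self.z_dim = z_dim
        # The encoder sees both x and c; the decoder sees c and the latent z.
        self.enc_mu = init_linear(self.rng, z_dim, x_dim + c_dim)
        self.enc_logvar = init_linear(self.rng, z_dim, x_dim + c_dim)
        self.dec = init_linear(self.rng, x_dim, c_dim + z_dim)

    def encode(self, x, c):
        """Return the mean and log-variance of the approximate posterior over z."""
        h = list(x) + list(c)
        return linear(*self.enc_mu, h), linear(*self.enc_logvar, h)

    def reparameterize(self, mu, logvar):
        """Sample z = mu + sigma * eps with eps drawn from N(0, 1)."""
        return [m + math.exp(0.5 * lv) * self.rng.gauss(0.0, 1.0)
                for m, lv in zip(mu, logvar)]

    def decode(self, c, z):
        """Reconstruct x from the driving signal and the latent code."""
        return linear(*self.dec, list(c) + list(z))

    def animate(self, c, z=None):
        """Animation-time path: x is unavailable, so z must be imputed.

        Defaulting to the prior mean (all zeros) is one possible strategy;
        sampling z from the prior would be another.
        """
        if z is None:
            z = [0.0] * self.z_dim
        return self.decode(c, z)


model = ToyConditionalVAE(x_dim=6, c_dim=3, z_dim=2)
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]   # stand-in for geometry/appearance
c = [0.5, -0.2, 0.3]                 # stand-in for pose/keypoint signal

mu, logvar = model.encode(x, c)       # training-time path
z = model.reparameterize(mu, logvar)
x_hat = model.decode(c, z)
kl = kl_to_standard_normal(mu, logvar)

x_anim = model.animate(c)             # animation-time path, driving signal only
```

In this sketch the choice of `z` at animation time is left to the caller, which mirrors the point that the imputation strategy can be chosen to suit the application.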

updated: Fri Jun 25 2021 18:30:39 GMT+0000 (UTC)

published: Fri May 21 2021 16:22:38 GMT+0000 (UTC)

arXiv
