UDE: A Unified Driving Engine for Human Motion Generation

Zixiang Zhou; Baoyuan Wang

Generating controllable and editable human motion sequences is a key challenge in 3D avatar generation. Generating and animating human motion was labor-intensive for a long time, until learning-based approaches were recently developed and applied. However, these approaches are still task-specific or modality-specific \cite{ahuja2019language2pose, ghosh2021synthesis, ferreira2021learning, li2021ai}. In this paper, we propose "UDE", the first unified driving engine that enables generating human motion sequences from natural language or audio sequences (see Fig.~\ref{fig:teaser}). Specifically, UDE consists of the following key components: 1) a motion quantization module based on VQVAE that represents a continuous motion sequence as discrete latent codes \cite{van2017neural}, 2) a modality-agnostic transformer encoder \cite{vaswani2017attention} that learns to map modality-aware driving signals to a joint space, 3) a unified token transformer (GPT-like \cite{radford2019language}) network that predicts the quantized latent code indices in an auto-regressive manner, and 4) a diffusion motion decoder that takes the motion tokens as input and decodes them into motion sequences with high diversity. We evaluate our method on the HumanML3D \cite{Guo_2022_CVPR} and AIST++ \cite{li2021learn} benchmarks, and the experimental results demonstrate that our method achieves state-of-the-art performance. Project website: \url{https://github.com/zixiangzhou916/UDE/}
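To make components 1) and 3) more concrete, the following is a minimal sketch of nearest-codebook quantization and auto-regressive token sampling. It is not the authors' implementation: the function names (`quantize`, `dequantize`, `sample_tokens`), the toy two-dimensional codebook, and the uniform `toy_logits` stand-in for a trained transformer are all hypothetical, chosen only for illustration.

```python
import math
import random


def quantize(frames, codebook):
    """Map each continuous feature vector to the index of its nearest codebook entry."""
    indices = []
    for frame in frames:
        # Squared Euclidean distance from this frame to every codebook vector.
        dists = [sum((f - c) ** 2 for f, c in zip(frame, code)) for code in codebook]
        indices.append(min(range(len(codebook)), key=dists.__getitem__))
    return indices


def dequantize(indices, codebook):
    """Look up the codebook vector for each discrete index."""
    return [list(codebook[i]) for i in indices]


def sample_tokens(next_token_logits, condition, length, rng):
    """Sample `length` token indices one at a time, each conditioned on the previous ones."""
    tokens = []
    for _ in range(length):
        logits = next_token_logits(condition, tokens)
        peak = max(logits)
        # Softmax weights (unnormalized); subtracting the peak avoids overflow.
        weights = [math.exp(value - peak) for value in logits]
        tokens.append(rng.choices(range(len(logits)), weights=weights, k=1)[0])
    return tokens


# Toy data (assumption): three 2-D "motion features" and a 3-entry codebook.
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
frames = [[0.1, -0.1], [0.9, 1.2], [1.8, 0.1]]
tokens = quantize(frames, codebook)          # [0, 1, 2]
reconstructed = dequantize(tokens, codebook)


def toy_logits(condition, previous_tokens):
    """Hypothetical stand-in for a trained model: uniform logits over the codebook."""
    return [0.0] * len(codebook)


sampled = sample_tokens(toy_logits, condition=None, length=5, rng=random.Random(0))
```

In a real system the codebook would be learned jointly with an encoder and decoder, and `toy_logits` would be replaced by a network conditioned on the encoded text or audio signal. The sketch only shows how continuous frames become discrete indices and how such indices can be generated step by step.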

updated: Tue Nov 29 2022 08:30:52 GMT+0000 (UTC)

published: Tue Nov 29 2022 08:30:52 GMT+0000 (UTC)

arXiv
