Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion

Tengfei Wang; Bo Zhang; Ting Zhang; Shuyang Gu; Jianmin Bao; Tadas Baltrusaitis; Jingjing Shen; Dong Chen; Fang Wen; Qifeng Chen; Baining Guo

This paper presents a 3D generative model that uses diffusion models to automatically generate 3D digital avatars represented as neural radiance fields. A significant challenge in generating such avatars is that the memory and processing costs in 3D are prohibitive for producing the rich details required for high-quality avatars. To tackle this problem, we propose the roll-out diffusion network (Rodin), which represents a neural radiance field as multiple 2D feature maps and rolls out these maps into a single 2D feature plane within which we perform 3D-aware diffusion. The Rodin model brings much-needed computational efficiency while preserving the integrity of diffusion in 3D: it uses 3D-aware convolution that attends to projected features in the 2D feature plane according to their original relationship in 3D. We also use latent conditioning to orchestrate the feature generation for global coherence, leading to high-fidelity avatars and enabling their semantic editing based on text prompts. Finally, we use hierarchical synthesis to further enhance details. The 3D avatars generated by our model compare favorably with those produced by existing generative techniques. We can generate highly detailed avatars with realistic hairstyles and facial hair such as beards. We also demonstrate 3D avatar generation from image or text, as well as text-guided editability.
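The roll-out step described in the abstract can be illustrated with a minimal NumPy sketch. It assumes three axis-aligned feature maps of identical shape `(C, H, W)` that are concatenated side by side; the abstract only says "multiple 2D feature maps", so the plane count, the shapes, and the function names `roll_out` and `unroll` are illustrative assumptions, not the authors' implementation. The 3D-aware convolution and the diffusion process itself are not shown.

```python
import numpy as np

def roll_out(planes):
    """Concatenate equally shaped (C, H, W) feature maps along the width
    axis, giving a single (C, H, n * W) feature plane.
    Hypothetical helper, not taken from the paper's code."""
    return np.concatenate(planes, axis=-1)

def unroll(rolled, n_planes):
    """Inverse of roll_out: split a (C, H, n * W) plane back into
    n_planes feature maps of shape (C, H, W)."""
    return np.split(rolled, n_planes, axis=-1)

# Assumed toy sizes: 4 channels, 8 x 8 resolution, 3 feature maps.
rng = np.random.default_rng(0)
C, H, W = 4, 8, 8
planes = [rng.standard_normal((C, H, W)) for _ in range(3)]

rolled = roll_out(planes)      # one 2D feature plane of shape (4, 8, 24)
restored = unroll(rolled, 3)   # the three original maps, unchanged
```

Because the roll-out is a plain concatenation, it is lossless: a standard 2D network can operate on `rolled`, and the individual feature maps can be recovered afterwards with `unroll`.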

updated: Mon Dec 12 2022 18:59:40 GMT+0000 (UTC)

published: Mon Dec 12 2022 18:59:40 GMT+0000 (UTC)

arXiv
