Local 3D Editing via 3D Distillation of CLIP Knowledge

Junha Hyung; Sungwon Hwang; Daejin Kim; Hyunji Lee; Jaegul Choo

3D content manipulation is an important computer vision task with many real-world applications (e.g., product design, cartoon generation, and 3D avatar editing). Recently proposed 3D GANs can generate diverse, photorealistic, 3D-aware content using Neural Radiance Fields (NeRF). However, manipulating NeRF remains challenging because visual quality tends to degrade after manipulation and because suboptimal control handles, such as 2D semantic maps, are used to drive the edits. While text-guided manipulation has shown potential in 3D editing, such approaches often lack locality. To overcome these problems, we propose Local Editing NeRF (LENeRF), which requires only text inputs for fine-grained and localized manipulation. Specifically, we present three add-on modules of LENeRF, the Latent Residual Mapper, the Attention Field Network, and the Deformation Network, which are jointly used to manipulate 3D features locally by estimating a 3D attention field. The 3D attention field is learned in an unsupervised way by distilling the zero-shot mask generation capability of CLIP into 3D space with multi-view guidance. We conduct diverse experiments and thorough evaluations, both quantitative and qualitative.
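The abstract does not state how the 3D attention field is applied to the features. As a rough illustration only, the sketch below assumes the simplest reading: a per-point linear blend in which an attention value of 0 keeps the source feature and a value of 1 takes the edited feature. The function name `blend_features` and the toy data are hypothetical and are not taken from the paper.

```python
def blend_features(source, target, attention):
    """Blend per-point feature vectors with a soft attention mask.

    ASSUMPTION: linear interpolation; the abstract does not specify the rule.
    source, target: lists of feature vectors, one per sampled 3D point.
    attention: list of floats in [0, 1], one per sampled 3D point.
    """
    if not (len(source) == len(target) == len(attention)):
        raise ValueError("source, target and attention must have equal length")
    blended = []
    for f_src, f_tgt, a in zip(source, target, attention):
        if not 0.0 <= a <= 1.0:
            raise ValueError("attention values must lie in [0, 1]")
        # a = 0 leaves the point untouched; a = 1 fully applies the edit.
        blended.append([(1.0 - a) * s + a * t for s, t in zip(f_src, f_tgt)])
    return blended


# Toy example: three points, two feature channels each.
source = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
target = [[3.0, 5.0], [3.0, 5.0], [3.0, 5.0]]
attention = [0.0, 0.5, 1.0]
result = blend_features(source, target, attention)
print(result)
```

Under this assumption, points with zero attention are untouched by the edit, which is one way to read the locality the abstract emphasizes.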

updated: Wed Jun 21 2023 21:09:45 GMT+0000 (UTC)

published: Wed Jun 21 2023 21:09:45 GMT+0000 (UTC)

arXiv
