Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures

Gal Metzer; Elad Richardson; Or Patashnik; Raja Giryes; Daniel Cohen-Or

Text-guided image generation has progressed rapidly in recent years, inspiring major breakthroughs in text-guided shape generation. Recently, it has been shown that using score distillation, one can successfully text-guide a NeRF model to generate a 3D object. We adapt the score distillation to the publicly available, and computationally efficient, Latent Diffusion Models, which apply the entire diffusion process in a compact latent space of a pretrained autoencoder. As NeRFs operate in image space, a naive solution for guiding them with latent score distillation would require encoding to the latent space at each guidance step. Instead, we propose to bring the NeRF to the latent space, resulting in a Latent-NeRF. Analyzing our Latent-NeRF, we show that while Text-to-3D models can generate impressive results, they are inherently unconstrained and may lack the ability to guide or enforce a specific 3D structure. To assist and direct the 3D generation, we propose to guide our Latent-NeRF using a Sketch-Shape: an abstract geometry that defines the coarse structure of the desired object. Then, we present means to integrate such a constraint directly into a Latent-NeRF. This unique combination of text and shape guidance allows for increased control over the generation process. We also show that latent score distillation can be successfully applied directly on 3D meshes. This allows for generating high-quality textures on a given geometry. Our experiments validate the power of our different forms of guidance and the efficiency of using latent rendering. Implementation is available at https://github.com/eladrich/latent-nerf
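The latent score distillation idea described in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation (that is in the repository linked above). It assumes a toy noise predictor in place of a real text-conditioned Latent Diffusion Model, treats a free 4-channel array as a stand-in for the latent feature map a Latent-NeRF would render, and uses the weighting w(t) = 1 - alpha_bar as one common choice. The names `toy_noise_predictor`, `score_distillation_grad`, and `optimize_latent` are hypothetical.

```python
import numpy as np


def add_noise(latent, noise, alpha_bar):
    """Forward diffusion step: z_t = sqrt(alpha_bar) * z + sqrt(1 - alpha_bar) * eps."""
    return np.sqrt(alpha_bar) * latent + np.sqrt(1.0 - alpha_bar) * noise


def toy_noise_predictor(noisy_latent, alpha_bar, target):
    """Hypothetical stand-in for a text-conditioned latent denoiser.

    It returns the noise that would explain `noisy_latent` if the clean
    latent were `target`, so `target` plays the role of the text prompt.
    """
    return (noisy_latent - np.sqrt(alpha_bar) * target) / np.sqrt(1.0 - alpha_bar)


def score_distillation_grad(latent, target, alpha_bar, rng):
    """Score distillation gradient w(t) * (eps_pred - eps) with respect to the latent."""
    noise = rng.standard_normal(latent.shape)
    noisy = add_noise(latent, noise, alpha_bar)
    predicted = toy_noise_predictor(noisy, alpha_bar, target)
    weight = 1.0 - alpha_bar  # assumed weighting w(t); other choices exist
    return weight * (predicted - noise)


def optimize_latent(target, steps=200, lr=0.5, seed=0):
    """Gradient descent on a latent map that stands in for a Latent-NeRF render."""
    rng = np.random.default_rng(seed)
    latent = np.zeros_like(target)
    for _ in range(steps):
        alpha_bar = rng.uniform(0.02, 0.98)  # random noise level per step
        latent = latent - lr * score_distillation_grad(latent, target, alpha_bar, rng)
    return latent


if __name__ == "__main__":
    target = np.ones((4, 8, 8))  # 4 latent channels, tiny 8x8 spatial grid
    result = optimize_latent(target)
    print(float(np.abs(result - target).max()))
```

Because the latent is noised and scored directly, no image-space encoder call appears in the loop, which is the efficiency point the abstract makes. In a real setup the gradient would be backpropagated into the NeRF parameters that produced the rendered latent rather than applied to a free array.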

updated: Mon Nov 14 2022 18:25:24 GMT+0000 (UTC)

published: Mon Nov 14 2022 18:25:24 GMT+0000 (UTC)

arXiv
