Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond

Mohammadreza Armandpour; Ali Sadeghian; Huangjie Zheng; Amir Sadeghian; Mingyuan Zhou

Although text-to-image diffusion models have made significant strides in generating images from text, they sometimes produce images that resemble their training data more than they follow the provided text. This limitation has hindered their use in both 2D and 3D applications. To address this problem, we explored the use of negative prompts but found that the current implementation fails to produce the desired results, particularly when the main and negative prompts overlap. To overcome this issue, we propose Perp-Neg, a new algorithm that leverages the geometrical properties of the score space to address the shortcomings of the current negative prompt algorithm. Perp-Neg does not require any training or fine-tuning of the model. Moreover, we experimentally demonstrate that Perp-Neg provides greater flexibility in image generation by enabling users to edit out unwanted concepts from the initially generated images in 2D cases. Furthermore, to extend the application of Perp-Neg to 3D, we conducted a thorough exploration of how Perp-Neg can be used in 2D to condition the diffusion model to generate desired views, rather than being biased toward the canonical views. Finally, we applied our 2D intuition to integrate Perp-Neg with the state-of-the-art text-to-3D method (DreamFusion), effectively addressing its Janus (multi-head) problem. Our project page is available at https://Perp-Neg.github.io/.
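The abstract does not spell out the update rule, so the following is only a minimal sketch of one plausible reading of "leverages the geometrical properties of the score space": keep just the part of the negative-prompt direction that is perpendicular to the main-prompt direction, so that any overlap between the two prompts is not subtracted from the main prompt. The function names, the flat-list score representation, and the weights `w_main` and `w_neg` are all hypothetical choices made for illustration; they are not taken from the paper or from any library.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def perpendicular_component(v: Vector, ref: Vector) -> Vector:
    """Return the part of v orthogonal to ref: v - (<v, ref> / <ref, ref>) * ref."""
    denom = dot(ref, ref)
    if denom == 0.0:
        # ref has no direction, so there is nothing to project out.
        return list(v)
    scale = dot(v, ref) / denom
    return [vi - scale * ri for vi, ri in zip(v, ref)]


def combine_scores(
    uncond: Vector,
    main: Vector,
    negative: Vector,
    w_main: float = 7.5,  # hypothetical guidance weight for the main prompt
    w_neg: float = 1.0,   # hypothetical weight for the negative prompt
) -> Vector:
    """Combine model scores so the negative prompt only acts perpendicular to the main prompt.

    All inputs are flattened score (noise) predictions for the same noisy sample.
    This is an illustrative assumption, not the authors' confirmed implementation.
    """
    # Directions contributed by each prompt relative to the unconditional score.
    d_main = [m - u for m, u in zip(main, uncond)]
    d_neg = [n - u for n, u in zip(negative, uncond)]
    # Drop the part of the negative direction that overlaps with the main direction.
    d_neg_perp = perpendicular_component(d_neg, d_main)
    return [
        u + w_main * dm - w_neg * dp
        for u, dm, dp in zip(uncond, d_main, d_neg_perp)
    ]
```

Under this sketch, a negative prompt whose score direction coincides with the main prompt's direction has a zero perpendicular component and therefore leaves the main guidance untouched, which is the overlap case the abstract identifies as problematic for the existing approach.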

updated: Wed Apr 26 2023 13:20:56 GMT+0000 (UTC)

published: Tue Apr 11 2023 04:29:57 GMT+0000 (UTC)

arXiv
