DAG: Depth-Aware Guidance with Denoising Diffusion Probabilistic Models

Gyeongnyeon Kim; Wooseok Jang; Gyuseong Lee; Susung Hong; Junyoung Seo; Seungryong Kim

In recent years, generative models have advanced significantly owing to the success of diffusion models. That success is often attributed to guidance techniques, such as classifier guidance and classifier-free guidance, which provide effective mechanisms for trading off fidelity against diversity. However, these methods cannot guide a generated image to be aware of its geometric configuration, e.g., depth, which hinders the application of diffusion models to areas that require a certain level of depth awareness. To address this limitation, we propose a novel guidance approach for diffusion models that uses estimated depth information derived from the rich intermediate representations of diffusion models. To do this, we first present a label-efficient depth estimation framework that uses the internal representations of diffusion models. At the sampling phase, we use two guidance techniques to self-condition the generated image on the estimated depth map: the first uses pseudo-labeling, and the second uses a depth-domain diffusion prior. Experiments and extensive ablation studies demonstrate the effectiveness of our method in guiding diffusion models toward geometrically plausible image generation. The project page is available at https://ku-cvlab.github.io/DAG/.
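For readers unfamiliar with the guidance techniques the abstract mentions, the following is a minimal sketch of how classifier-free guidance is commonly formulated: the conditional and unconditional noise predictions are blended, and a scale controls the fidelity/diversity trade-off. This illustrates the existing technique only, not the depth-aware guidance proposed in the paper. The function name `classifier_free_guidance`, its parameters, and the use of plain Python lists in place of image tensors are assumptions made for illustration.

```python
from typing import List, Sequence


def classifier_free_guidance(
    eps_uncond: Sequence[float],
    eps_cond: Sequence[float],
    scale: float,
) -> List[float]:
    """Blend unconditional and conditional noise predictions.

    Hypothetical helper for illustration. With this convention,
    scale = 0 returns the unconditional prediction, scale = 1 returns
    the conditional one, and scale > 1 extrapolates past the conditional
    prediction (typically higher fidelity, lower diversity).
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    # Element-wise: eps = eps_uncond + scale * (eps_cond - eps_uncond)
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy values standing in for the two noise predictions of a denoiser.
guided = classifier_free_guidance([0.0, 1.0], [1.0, 3.0], scale=2.0)
```

Other write-ups use an equivalent convention with weight w, where the blend is (1 + w) times the conditional prediction minus w times the unconditional one; that corresponds to scale = 1 + w here.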

updated: Sat Dec 17 2022 12:47:19 GMT+0000 (UTC)

published: Sat Dec 17 2022 12:47:19 GMT+0000 (UTC)

arXiv
