Interpreting and Improving Diffusion Models Using the Euclidean Distance Function

Frank Permenter; Chenyang Yuan

Denoising is intuitively related to projection. Indeed, under the manifold hypothesis, adding random noise is approximately equivalent to orthogonal perturbation. Hence, learning to denoise is approximately learning to project. In this paper, we use this observation to reinterpret denoising diffusion models as approximate gradient descent applied to the Euclidean distance function. We then provide a straightforward convergence analysis of the DDIM sampler under simple assumptions on the projection error of the denoiser. Finally, we propose a new sampler based on two simple modifications to DDIM using insights from our theoretical results. In as few as 5-10 function evaluations, our sampler achieves state-of-the-art FID scores on pretrained CIFAR-10 and CelebA models and can generate high-quality samples on latent diffusion models.
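
The link between denoising, projection, and gradient descent on the distance function can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the abstract: the "data manifold" is the unit circle, the denoiser is a hypothetical ideal one that returns the exact projection, and the update rule is a DDIM-style deterministic step written in a noise-level parametrization. Because the gradient of half the squared distance to a set is `x - proj(x)`, each step here is a gradient step on that function with step size `(sigma_t - sigma_next) / sigma_t`.

```python
import numpy as np

def project_to_circle(x):
    """Exact projection onto the unit circle (toy stand-in for a data manifold)."""
    return x / np.linalg.norm(x)

def ideal_noise_prediction(x, sigma):
    """Hypothetical ideal denoiser: the predicted clean point is the projection,
    so the predicted noise is the scaled residual (x - proj(x)) / sigma."""
    return (x - project_to_circle(x)) / sigma

def distance_to_circle(x):
    """Euclidean distance from x to the unit circle."""
    return abs(np.linalg.norm(x) - 1.0)

def ddim_like_sampler(x, sigmas):
    """Assumed DDIM-style update: x <- x - (sigma_t - sigma_next) * eps(x, sigma_t).
    With the ideal denoiser this moves x toward its projection along x - proj(x)."""
    distances = [distance_to_circle(x)]
    for sigma_t, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        x = x - (sigma_t - sigma_next) * ideal_noise_prediction(x, sigma_t)
        distances.append(distance_to_circle(x))
    return x, distances

rng = np.random.default_rng(0)
sigmas = np.linspace(2.0, 0.0, 11)        # 10 steps, noise level decreasing to 0
x0 = sigmas[0] * rng.standard_normal(2)   # start from pure noise at the top level
x_final, distances = ddim_like_sampler(x0, sigmas)
```

In this idealized setting the distance to the circle shrinks by the factor `sigma_next / sigma_t` at every step, so the iterate reaches the circle when the schedule reaches zero. A learned denoiser only approximates the projection, which is why a convergence analysis needs assumptions on the projection error.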

updated: Tue Feb 13 2024 16:08:41 GMT+0000 (UTC)

published: Thu Jun 08 2023 00:56:33 GMT+0000 (UTC)

arXiv
