Fast Sampling of Diffusion Models via Operator Learning

Hongkai Zheng; Weili Nie; Arash Vahdat; Kamyar Azizzadenesheli; Anima Anandkumar

Diffusion models have been widely adopted across many areas. However, sampling from them is slow, because emulating a continuous process defined by differential equations requires hundreds to thousands of network evaluations. In this work, we accelerate the sampling process of diffusion models with neural operators, an efficient method for solving the probability flow differential equations. Whereas other fast sampling methods are sequential in nature, we are the first to propose a parallel decoding method that generates images with only one model forward pass. We propose diffusion model sampling with neural operator (DSNO), which maps the initial condition, i.e., the Gaussian distribution, to the continuous-time solution trajectory of the reverse diffusion process. To model the temporal correlations along the trajectory, we introduce temporal convolution layers, parameterized in Fourier space, into the given diffusion model backbone. We show that our method achieves state-of-the-art FID scores of 4.12 on CIFAR-10 and 8.35 on ImageNet-64 in the one-model-evaluation setting.
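To make the idea of a temporal convolution parameterized in Fourier space more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The function name `temporal_spectral_conv`, the `(T, C)` trajectory layout, the `weights` tensor, and the `n_modes` truncation are all assumptions made for illustration. In DSNO such layers sit inside a diffusion model backbone operating on image features, which this sketch omits.

```python
import numpy as np

def temporal_spectral_conv(x, weights, n_modes):
    """Hypothetical temporal convolution applied in Fourier space.

    x:       real array of shape (T, C_in), one feature vector per time step.
    weights: complex array of shape (n_modes, C_in, C_out); in a real model
             these would be learned parameters.
    n_modes: number of lowest temporal frequencies to keep.
    """
    T = x.shape[0]
    x_hat = np.fft.rfft(x, axis=0)              # (T // 2 + 1, C_in)
    k = min(n_modes, x_hat.shape[0])            # cannot keep more modes than exist
    out_hat = np.zeros((x_hat.shape[0], weights.shape[2]), dtype=complex)
    # Mix channels independently at each retained frequency;
    # higher frequencies are left at zero (truncated).
    out_hat[:k] = np.einsum("kc,kcd->kd", x_hat[:k], weights[:k])
    return np.fft.irfft(out_hat, n=T, axis=0)   # back to (T, C_out)

# Illustrative usage with random stand-in weights (assumed shapes).
rng = np.random.default_rng(0)
trajectory = rng.standard_normal((8, 3))        # T = 8 time steps, 3 channels
w = rng.standard_normal((4, 3, 3)) + 1j * rng.standard_normal((4, 3, 3))
mixed = temporal_spectral_conv(trajectory, w, n_modes=4)
```

Because the multiplication happens per frequency along the time axis, every output time step depends on every input time step at once, which is one way to read the abstract's point about modeling temporal correlations along the whole trajectory in a single pass.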

updated: Tue Jan 31 2023 22:45:41 GMT+0000 (UTC)

published: Thu Nov 24 2022 07:30:27 GMT+0000 (UTC)

arXiv
