Fast Diffusion Model

Zike Wu; Pan Zhou; Kenji Kawaguchi; Hanwang Zhang


Despite their success in real data synthesis, diffusion models (DMs) often suffer from slow and costly training and sampling, which limits their broader application. To mitigate this, we propose the Fast Diffusion Model (FDM), which improves the diffusion process of DMs from a stochastic optimization perspective to speed up both training and sampling. Specifically, we first find that the diffusion process of DMs accords with the stochastic optimization process of stochastic gradient descent (SGD) on a stochastic time-variant problem. Momentum SGD uses both the current gradient and an extra momentum term, and thereby achieves more stable and faster convergence. This inspires us to introduce momentum into the diffusion process to accelerate both training and sampling. However, doing so raises the challenge of deriving the noise perturbation kernel from the momentum-based diffusion process. To this end, we frame the momentum-based process as a damped oscillation system whose critically damped state (the kernel solution) avoids oscillation, so the diffusion process converges faster. Empirical results show that FDM can be applied to several popular DM frameworks, e.g., VP, VE, and EDM, and reduces their training cost by about 50% with comparable image synthesis performance on the CIFAR-10, FFHQ, and AFHQv2 datasets. Moreover, FDM reduces the number of sampling steps by about 3× while achieving similar performance under the same deterministic samplers. The code is available at https://github.com/sail-sg/FDM.
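To make the damped-oscillation intuition concrete, the following is a minimal sketch of a textbook damped harmonic oscillator, x'' + 2ζωx' + ω²x = 0, evaluated in closed form for three damping ratios. It is not the FDM perturbation kernel or any code from the authors' repository; the function name `damped_response`, the initial conditions x(0) = 1, x'(0) = 0, and the chosen values of ζ and ω are all assumptions made for illustration.

```python
import math


def damped_response(zeta, t, omega=1.0):
    """Closed-form x(t) for x'' + 2*zeta*omega*x' + omega**2 * x = 0.

    Illustrative assumption: initial conditions x(0) = 1, x'(0) = 0.
    """
    if zeta < 1.0:
        # Underdamped: decaying oscillation at frequency omega_d.
        omega_d = omega * math.sqrt(1.0 - zeta ** 2)
        envelope = math.exp(-zeta * omega * t)
        return envelope * (
            math.cos(omega_d * t)
            + (zeta * omega / omega_d) * math.sin(omega_d * t)
        )
    if zeta == 1.0:
        # Critically damped: repeated real root, no oscillation.
        return math.exp(-omega * t) * (1.0 + omega * t)
    # Overdamped: two distinct real roots r1 (slow) and r2 (fast).
    s = math.sqrt(zeta ** 2 - 1.0)
    r1 = -omega * (zeta - s)
    r2 = -omega * (zeta + s)
    return (r2 * math.exp(r1 * t) - r1 * math.exp(r2 * t)) / (r2 - r1)


# Sample t in [0, 20] and summarize each regime.
times = [0.1 * k for k in range(201)]
for name, zeta in [("underdamped", 0.2), ("critical", 1.0), ("overdamped", 3.0)]:
    xs = [damped_response(zeta, t) for t in times]
    # min x < 0 means the trajectory overshoots zero (oscillates).
    print(f"{name:12s} min x = {min(xs):+.3f}, |x(10)| = {abs(xs[100]):.4f}")
```

In this toy setting, the underdamped trajectory overshoots zero, while the critically damped one stays positive and ends up closer to zero at t = 10 than the overdamped one. That is the qualitative behavior the abstract appeals to when it says the critically damped state avoids oscillation and converges faster.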

updated: Mon Jun 12 2023 09:38:04 GMT+0000 (UTC)

published: Mon Jun 12 2023 09:38:04 GMT+0000 (UTC)

arXiv

