Refining Generative Process with Discriminator Guidance in Score-based Diffusion Models

Dongjun Kim; Yeongmin Kim; Se Jung Kwon; Wanmo Kang; Il-Chul Moon

The proposed method, Discriminator Guidance, aims to improve the sample generation of pre-trained diffusion models. The approach introduces a discriminator that gives explicit supervision on whether a denoising sample path is realistic. Unlike GANs, our approach does not require joint training of the score and discriminator networks. Instead, we train the discriminator after score training, which makes discriminator training stable and fast to converge. During sample generation, we add an auxiliary term to the pre-trained score to deceive the discriminator. With an optimal discriminator, this term corrects the model score to the data score, which implies that the discriminator complements score estimation. Using our algorithm, we achieve state-of-the-art results on ImageNet 256x256 with FID 1.83 and recall 0.64, similar to the validation data's FID (1.68) and recall (0.66). We release the code at https://github.com/alsdudrla10/DG.
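The claim that the auxiliary term corrects the model score to the data score at the optimal discriminator can be checked on a toy problem. The sketch below is not the released code. It assumes the auxiliary term is the gradient of the discriminator's log-odds, log(d / (1 - d)), and it uses two hypothetical 1D Gaussians in place of the data and model distributions. Every function and constant name is illustrative, and noise levels and time conditioning are omitted.

```python
import math

# Hypothetical toy setup: "data" and "model" densities are 1D Gaussians
# with the same standard deviation but different means.
DATA_MEAN, MODEL_MEAN, STD = 0.0, 0.5, 1.0


def gaussian_logpdf(x, mean, std):
    """Log density of a 1D Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))


def gaussian_score(x, mean, std):
    """Score, i.e. the derivative of the log density with respect to x."""
    return -(x - mean) / std ** 2


def optimal_discriminator(x):
    """Bayes-optimal discriminator d(x) = p_data(x) / (p_data(x) + p_model(x))."""
    p = math.exp(gaussian_logpdf(x, DATA_MEAN, STD))
    q = math.exp(gaussian_logpdf(x, MODEL_MEAN, STD))
    return p / (p + q)


def discriminator_correction(x, eps=1e-5):
    """Assumed auxiliary term: d/dx log(d(x) / (1 - d(x))), by central differences."""
    def log_odds(z):
        d = optimal_discriminator(z)
        return math.log(d) - math.log(1.0 - d)
    return (log_odds(x + eps) - log_odds(x - eps)) / (2 * eps)


def guided_score(x):
    """Model score plus the discriminator-based correction."""
    return gaussian_score(x, MODEL_MEAN, STD) + discriminator_correction(x)


# With the optimal discriminator, the guided score should match the data score.
for x in (-1.0, 0.0, 1.3):
    gap = abs(guided_score(x) - gaussian_score(x, DATA_MEAN, STD))
    print(f"x={x:+.1f}  |guided - data score| = {gap:.2e}")
```

In this toy case the log-odds reduce to log p_data(x) - log p_model(x), so the correction is exactly the gap between the two scores. With a trained, imperfect discriminator the correction would only approximate that gap.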

updated: Sun Jun 04 2023 22:19:27 GMT+0000 (UTC)

published: Mon Nov 28 2022 20:04:12 GMT+0000 (UTC)

arXiv
