Few-shot Image Generation with Diffusion Models

Jingyuan Zhu; Huimin Ma; Jiansheng Chen; Jian Yuan

Denoising diffusion probabilistic models (DDPMs) have been proven capable of synthesizing high-quality images with remarkable diversity when trained on large amounts of data. However, to our knowledge, few-shot image generation tasks have yet to be studied with DDPM-based approaches. Modern approaches are mainly built on Generative Adversarial Networks (GANs) and adapt models pre-trained on large source domains to target domains using a few available samples. In this paper, we make the first attempt to study when DDPMs overfit and suffer severe diversity degradation as training data become scarce. We then fine-tune DDPMs pre-trained on large source domains to address the overfitting problem when training data are limited. Although the directly fine-tuned models converge faster and improve generation quality and diversity compared with training from scratch, they still fail to retain some diverse features and can only produce coarse images. Therefore, we design a DDPM pairwise adaptation (DDPM-PA) approach to optimize few-shot DDPM domain adaptation. DDPM-PA efficiently preserves information learned from source domains by maintaining the relative pairwise distances between generated samples during adaptation. In addition, DDPM-PA enhances the learning of high-frequency details from source models and limited training data. DDPM-PA further improves generation quality and diversity and achieves better results than current state-of-the-art GAN-based approaches. We demonstrate the effectiveness of our approach qualitatively and quantitatively on a series of few-shot image generation tasks.
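
The abstract does not give the exact loss used by DDPM-PA, so the following is only a minimal sketch of what "keeping the relative pairwise distances between generated samples" could look like. It assumes each sample is a flat feature vector, that similarity is measured with cosine similarity, and that the source and adapted models are compared through a KL divergence between softmax-normalized similarity distributions. All function names here (`cosine`, `pairwise_distribution`, `pairwise_preservation_loss`) are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def pairwise_distribution(samples, i):
    """Softmax over similarities between sample i and every other sample."""
    sims = [cosine(samples[i], s) for j, s in enumerate(samples) if j != i]
    return softmax(sims)


def pairwise_preservation_loss(source_samples, adapted_samples):
    """Mean KL(adapted || source) over per-sample similarity distributions.

    Assumption: source_samples[k] and adapted_samples[k] were generated
    from the same noise input by the source and adapted models.
    """
    n = len(source_samples)
    total = 0.0
    for i in range(n):
        p_src = pairwise_distribution(source_samples, i)
        p_ada = pairwise_distribution(adapted_samples, i)
        total += sum(a * math.log(a / s) for a, s in zip(p_ada, p_src))
    return total / n
```

Under these assumptions, the loss is zero when the adapted model keeps the same relative similarity structure as the source model, and it grows when the adapted samples collapse toward one another, which is the diversity degradation the abstract describes.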

updated: Tue Mar 07 2023 05:43:56 GMT+0000 (UTC)

published: Mon Nov 07 2022 02:18:27 GMT+0000 (UTC)

arXiv
