A Closer Look at Few-shot Image Generation

Yunqing Zhao; Henghui Ding; Houjing Huang; Ngai-Man Cheung

Modern GANs excel at generating high-quality, diverse images. However, when a pretrained GAN is transferred to a small target dataset (e.g., 10-shot), the generator tends to replicate the training samples. Several methods have been proposed to address this few-shot image generation task, but little effort has been made to analyze them under a unified framework. As our first contribution, we propose a framework to analyze existing methods during adaptation. Our analysis discovers that while some methods focus disproportionately on preserving diversity, which impedes quality improvement, all methods achieve similar quality after convergence. Therefore, the better methods are those that slow down diversity degradation. Furthermore, our analysis reveals that there is still plenty of room to slow down diversity degradation further. Informed by our analysis, and to slow down the diversity degradation of the target generator during adaptation, our second contribution proposes to apply mutual information (MI) maximization to retain the source domain's rich multi-level diversity information in the target domain generator. We propose to perform MI maximization via a contrastive loss (CL), leveraging the generator and discriminator as two feature encoders to extract different multi-level features for computing the CL. We refer to our method as Dual Contrastive Learning (DCL). Extensive experiments on several public datasets show that, while slowing the generator's diversity degradation during adaptation, our proposed DCL delivers visually pleasing quality and state-of-the-art quantitative performance.
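To make the idea of MI maximization via a contrastive loss more concrete, the sketch below computes a generic InfoNCE-style loss for a single anchor feature. It is not the authors' implementation: the function name `info_nce_loss`, the `temperature` value, and the use of plain NumPy vectors in place of generator or discriminator features are all assumptions made for illustration. One plausible pairing, assumed here rather than stated in the abstract, would treat features that the source and adapted networks produce from the same latent code as the positive pair, and features from other samples as negatives.

```python
import numpy as np


def info_nce_loss(anchor, positive, negatives, temperature=0.07):
    """Generic InfoNCE loss for one anchor (illustrative, hypothetical helper).

    anchor:    (d,) feature vector
    positive:  (d,) feature vector that should match the anchor
    negatives: (n, d) feature vectors that should not match the anchor
    """
    def normalize(x):
        # Unit-normalize along the feature axis so dot products are cosine similarities.
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    a = normalize(np.asarray(anchor, dtype=float))
    p = normalize(np.asarray(positive, dtype=float))
    n = normalize(np.asarray(negatives, dtype=float))

    pos_logit = np.dot(a, p) / temperature   # similarity to the positive
    neg_logits = (n @ a) / temperature       # similarities to each negative
    logits = np.concatenate(([pos_logit], neg_logits))

    # Numerically stable log-sum-exp over all candidates.
    m = logits.max()
    log_denominator = m + np.log(np.exp(logits - m).sum())

    # Negative log-probability of picking the positive among all candidates.
    return float(log_denominator - pos_logit)


# Toy usage: an aligned positive gives a lower loss than a misaligned one.
anchor = np.array([1.0, 0.0, 0.0])
aligned = info_nce_loss(anchor, np.array([1.0, 0.0, 0.0]),
                        np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]))
misaligned = info_nce_loss(anchor, np.array([0.0, 1.0, 0.0]),
                           np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]))
print(aligned < misaligned)
```

Minimizing a loss of this form maximizes a lower bound on the mutual information between the anchor and positive features, which is the general sense in which a contrastive loss can serve MI maximization.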

updated: Sun May 08 2022 07:46:26 GMT+0000 (UTC)

published: Sun May 08 2022 07:46:26 GMT+0000 (UTC)

arXiv
