A Closer Look at Few-shot Image Generation

Yunqing Zhao; Henghui Ding; Houjing Huang; Ngai-Man Cheung

Modern GANs excel at generating high-quality, diverse images. However, when a pretrained GAN is transferred to a small target dataset (e.g., 10-shot), the generator tends to replicate the training samples. Several methods have been proposed to address this few-shot image generation task, but little effort has been made to analyze them under a unified framework. As our first contribution, we propose a framework to analyze existing methods during adaptation. Our analysis finds that while some methods focus disproportionately on preserving diversity, which impedes quality improvement, all methods achieve similar quality after convergence. Therefore, the better methods are those that slow down diversity degradation. Furthermore, our analysis reveals that there is still plenty of room to slow diversity degradation further. Informed by this analysis, and to slow the diversity degradation of the target generator during adaptation, our second contribution applies mutual information (MI) maximization to retain the source domain's rich multi-level diversity information in the target domain generator. We perform MI maximization with a contrastive loss (CL), leveraging the generator and discriminator as two feature encoders that extract different multi-level features for computing the CL. We refer to our method as Dual Contrastive Learning (DCL). Extensive experiments on several public datasets show that the proposed DCL slows the generator's diversity degradation during adaptation while delivering visually pleasing quality and state-of-the-art quantitative performance. Project page: yunqing-me.github.io/A-Closer-Look-at-FSIG.
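
The idea of maximizing MI through a contrastive loss can be illustrated with a generic InfoNCE-style loss. The following is a minimal sketch, not the authors' DCL implementation: the function names, the temperature value, and the toy feature vectors are all hypothetical. It also assumes that the positive pair consists of source and target features computed from the same latent code, with features from other latent codes serving as negatives, which the abstract does not spell out.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce_loss(anchor, positive, negatives, temperature=0.07):
    """Generic InfoNCE loss: -log softmax score of the positive pair.

    Minimizing this loss maximizes a lower bound on the mutual
    information between the anchor and positive feature views.
    The temperature default is an assumption, not a value from the paper.
    """
    logits = [cosine_similarity(anchor, positive) / temperature]
    logits += [cosine_similarity(anchor, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over all candidates.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(
        sum(math.exp(l - max_logit) for l in logits)
    )
    return log_sum_exp - logits[0]


# Hypothetical toy features (assumed setup, for illustration only):
# one target-generator feature, the source-generator feature from the
# same latent code, and source features from two other latent codes.
target_feat = [1.0, 0.1, 0.0]
source_same_latent = [0.9, 0.2, 0.0]
source_other_latents = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

loss_aligned = info_nce_loss(
    target_feat, source_same_latent, source_other_latents
)
# Swapping in a mismatched "positive" should give a larger loss.
loss_misaligned = info_nce_loss(
    target_feat,
    source_other_latents[0],
    [source_same_latent, source_other_latents[1]],
)
```

In this toy setup the loss is small when the target feature stays close to the source feature from the same latent code and grows when it drifts toward features from other latent codes. The abstract describes computing such a loss on multi-level features from both the generator and the discriminator; this sketch covers only a single feature level and a single encoder.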

updated: Sat Apr 15 2023 14:51:43 GMT+0000 (UTC)

published: Sun May 08 2022 07:46:26 GMT+0000 (UTC)

arXiv
