Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis

Bingchen Liu; Yizhe Zhu; Kunpeng Song; Ahmed Elgammal

Training Generative Adversarial Networks (GANs) on high-fidelity images usually requires large-scale GPU clusters and a vast number of training images. In this paper, we study the few-shot image synthesis task for GANs with minimum computing cost. We propose a light-weight GAN structure that achieves superior quality at 1024 × 1024 resolution. Notably, the model converges from scratch with just a few hours of training on a single RTX-2080 GPU, and performs consistently even with fewer than 100 training samples. Our work rests on two technical designs: a skip-layer channel-wise excitation module and a self-supervised discriminator trained as a feature encoder. On thirteen datasets covering a wide variety of image domains (the datasets and code are available at https://github.com/odegeasslbc/FastGAN-pytorch), we show our model's superior performance compared to the state-of-the-art StyleGAN2 when data and computing budgets are limited.
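
The abstract names the skip-layer channel-wise excitation module without describing its internals. As a rough illustration only, the sketch below assumes a squeeze-and-excitation-style gate: a low-resolution feature map is pooled to one value per channel, mixed into one sigmoid weight per high-resolution channel, and used to rescale the high-resolution feature map. The function names, the global average pooling, and the single linear mixing step are simplifying assumptions, not the authors' implementation; the linked repository is the reference for the actual layers.

```python
import math

def channel_gate(low_res, weights):
    """Return one gate value in (0, 1) per output channel.

    low_res: list of channels, each a 2D list (low-resolution feature map).
    weights: hypothetical mixing matrix, one row per high-resolution channel.
    """
    # Squeeze: average every low-resolution channel down to a single number.
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in low_res]
    # Excite: linear mix of the pooled values, squashed by a sigmoid.
    return [
        1.0 / (1.0 + math.exp(-sum(w * p for w, p in zip(row, pooled))))
        for row in weights
    ]

def skip_layer_excitation(high_res, low_res, weights):
    """Rescale each high-resolution channel by its gate value."""
    gate = channel_gate(low_res, weights)
    return [[[g * v for v in row] for row in ch] for g, ch in zip(gate, high_res)]

# Toy input: two 2x2 low-resolution channels gate one 4x4 high-resolution channel.
low = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
high = [[[1.0] * 4 for _ in range(4)]]
out = skip_layer_excitation(high, low, weights=[[1.0, -1.0]])
```

Because the gate depends only on per-channel statistics of the low-resolution map, the two feature maps may have different spatial sizes, which is what lets the connection skip across resolutions.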

updated: Tue Jan 12 2021 22:02:54 GMT+0000 (UTC)

published: Tue Jan 12 2021 22:02:54 GMT+0000 (UTC)

arXiv
