Alleviating Mode Collapse in GAN via Diversity Penalty Module

Sen Pei; Richard Yi Da Xu; Shiming Xiang; Gaofeng Meng

The vanilla GAN (Goodfellow et al. 2014) suffers severely from mode collapse, which usually manifests as generated images that are highly similar to one another even when their corresponding latent vectors are very different. In this paper, we introduce a pluggable diversity penalty module (DPM) to alleviate mode collapse in GANs. It reduces the similarity of image pairs in feature space: if two latent vectors are different, we force the generator to produce two images with different features. The normalized Gram matrix is used to measure similarity. We compare the proposed method with Unrolled GAN (Metz et al. 2016), BourGAN (Xiao, Zhong, and Zheng 2018), PacGAN (Lin et al. 2018), VEEGAN (Srivastava et al. 2017) and ALI (Dumoulin et al. 2016) on a 2D synthetic dataset, and the results show that the diversity penalty module helps a GAN capture many more modes of the data distribution. Further, in classification tasks, we apply this method as image data augmentation on MNIST, Fashion-MNIST and CIFAR-10, and the classification test accuracy improves by 0.24%, 1.34% and 0.52%, respectively, compared with WGAN-GP (Gulrajani et al. 2017). In domain translation, the diversity penalty module helps StarGAN (Choi et al. 2018) generate more accurate attention masks and accelerates convergence. Finally, we quantitatively evaluate the proposed method with IS and FID on CelebA, CIFAR-10, MNIST and Fashion-MNIST, and the results suggest that a GAN with the diversity penalty module achieves a much higher IS and a lower FID than some SOTA GAN architectures.
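
The abstract does not give the exact form of the penalty, so the following is only a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes that the "normalized Gram matrix" is the Gram matrix of L2-normalized row vectors (pairwise cosine similarity), and that the penalty weights feature similarity by latent dissimilarity. The function names `normalized_gram` and `diversity_penalty` and the weighting `1 - G_z` are hypothetical choices made for illustration.

```python
import numpy as np


def normalized_gram(x, eps=1e-12):
    """Gram matrix of the rows of x after L2-normalizing each row.

    Entry (i, j) is the cosine similarity between row i and row j.
    (Assumed reading of "normalized Gram matrix"; not taken from the paper.)
    """
    x = np.asarray(x, dtype=np.float64)
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.maximum(norms, eps)  # guard against zero-norm rows
    return unit @ unit.T


def diversity_penalty(latents, features):
    """Hypothetical penalty: high when dissimilar latents give similar features.

    latents:  array of shape (n, d_z), one latent vector per row, n >= 2
    features: array of shape (n, d_f), features of the generated samples
    """
    g_z = normalized_gram(latents)
    g_f = normalized_gram(features)
    n = g_z.shape[0]
    off_diag = ~np.eye(n, dtype=bool)  # ignore each sample paired with itself
    weights = 1.0 - g_z                # large when two latent vectors differ
    return float(np.mean(weights[off_diag] * g_f[off_diag]))


# Two orthogonal latent vectors.
z = np.array([[1.0, 0.0], [0.0, 1.0]])

# Collapsed generator: both latents map to the same feature vector.
collapsed = diversity_penalty(z, np.array([[1.0, 1.0], [1.0, 1.0]]))

# Diverse generator: the two latents map to orthogonal feature vectors.
diverse = diversity_penalty(z, np.array([[1.0, 0.0], [0.0, 1.0]]))
```

In this toy case the collapsed generator receives a larger penalty than the diverse one. In training, a term of this kind would presumably be added to the generator loss with some weighting coefficient, and the features would come from an intermediate network layer; both details are assumptions here.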

updated: Tue Aug 24 2021 11:52:40 GMT+0000 (UTC)

published: Thu Aug 05 2021 03:41:14 GMT+0000 (UTC)

arXiv

