Combating Mode Collapse in GANs via Manifold Entropy Estimation

Haozhe Liu; Bing Li; Haoqian Wu; Hanbang Liang; Yawen Huang; Yuexiang Li; Bernard Ghanem; Yefeng Zheng

Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications in recent years. However, mode collapse remains a critical problem in GANs. In this paper, we propose a novel training pipeline to address the mode collapse issue of GANs. Unlike existing methods, we propose to generalize the discriminator as a feature embedding and to maximize the entropy of distributions in the embedding space learned by the discriminator. Specifically, two regularization terms, i.e., Deep Local Linear Embedding (DLLE) and Deep Isometric feature Mapping (DIsoMap), are designed to encourage the discriminator to learn the structural information embedded in the data, so that the embedding space learned by the discriminator is well formed. Based on this well-learned embedding space, a non-parametric entropy estimator is designed to efficiently maximize the entropy of the embedding vectors, which serves as an approximation of maximizing the entropy of the generated distribution. By improving the discriminator and maximizing the distance between the most similar samples in the embedding space, our pipeline effectively reduces mode collapse without sacrificing the quality of generated samples. Extensive experimental results show the effectiveness of our method, which outperforms the GAN baseline, MaF-GAN on CelebA (9.13 vs. 12.43 in FID) and surpasses the recent state-of-the-art energy-based model on the ANIME-FACE dataset (2.80 vs. 2.26 in Inception score).
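The idea of maximizing the distance between the most similar samples in the embedding space can be illustrated with a small sketch. The abstract does not give the estimator's formula, so the code below is only an assumption: it uses the mean log nearest-neighbor distance, a common non-parametric entropy surrogate in the spirit of Kozachenko-Leonenko estimators with constants dropped. The function names, the `eps` parameter, and the toy batches are hypothetical and do not come from the paper.

```python
import math


def nearest_neighbor_distances(embeddings):
    """For each vector, return the Euclidean distance to its closest other vector."""
    distances = []
    for i, u in enumerate(embeddings):
        best = math.inf
        for j, v in enumerate(embeddings):
            if i != j:
                best = min(best, math.dist(u, v))
        distances.append(best)
    return distances


def entropy_proxy(embeddings, eps=1e-8):
    """Mean log nearest-neighbor distance (assumed surrogate, not the paper's estimator).

    Larger values indicate that the batch is more spread out in embedding space.
    eps is a hypothetical guard against log(0) when two vectors coincide.
    """
    nn = nearest_neighbor_distances(embeddings)
    return sum(math.log(d + eps) for d in nn) / len(nn)


# Toy 2-D "embeddings": a nearly collapsed batch and a spread-out batch.
collapsed = [(0.0, 0.0), (0.01, 0.0), (0.0, 0.01), (0.01, 0.01)]
diverse = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

if __name__ == "__main__":
    print("collapsed:", entropy_proxy(collapsed))
    print("diverse:  ", entropy_proxy(diverse))
```

In a training loop, a generator would presumably be rewarded for increasing such a quantity over the discriminator's embeddings of generated samples. That coupling is not shown here, since the abstract does not specify it.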

updated: Wed Nov 23 2022 09:26:33 GMT+0000 (UTC)

published: Thu Aug 25 2022 12:33:31 GMT+0000 (UTC)

arXiv
