Latent Space is Feature Space: Regularization Term for GANs Training on Limited Dataset

Pengwei Wang

Generative Adversarial Networks (GANs) are currently widely used as an unsupervised image generation method. Current state-of-the-art GANs can generate high-resolution, photorealistic images. However, they require a large amount of data; without it, the model is prone to generating low-quality images with similar patterns (mode collapse). I propose an additional structure and loss function for GANs called LFM, trained to maximize the feature diversity between the different dimensions of the latent space in order to avoid mode collapse without affecting image quality. Orthogonal latent vector pairs are created, and the pairs of feature vectors extracted by the discriminator are compared by dot product, which places the discriminator and the generator in a novel adversarial relationship. In experiments, the system is built upon DCGAN and is shown to improve the Frechet Inception Distance (FID) when trained from scratch on the CelebA dataset. The system incurs only a mild extra computational cost and can work with data augmentation methods. The code is available at github.com/penway/LFM.
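The two ingredients named in the abstract, orthogonal latent vector pairs and a dot-product comparison of feature vectors, can be illustrated with a minimal sketch. This is not the code from the LFM repository. The helper names, the single Gram-Schmidt step used to build the pair, the `toy_features` stand-in for the discriminator's feature extractor, and the squared cosine similarity used as a penalty are all assumptions made for illustration; the actual loss in the paper may be defined differently.

```python
import math
import random


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_latent_pair(dim, rng):
    """Sample z1 from N(0, I), then make z2 orthogonal to z1.

    Assumption: one Gram-Schmidt step; the paper may build pairs differently.
    """
    z1 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    z2 = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    scale = dot(z1, z2) / dot(z1, z1)
    z2 = [b - scale * a for a, b in zip(z1, z2)]
    return z1, z2


def feature_similarity(f1, f2):
    """Normalized dot product (cosine similarity) of two feature vectors."""
    norm = math.sqrt(dot(f1, f1)) * math.sqrt(dot(f2, f2))
    return dot(f1, f2) / norm if norm > 0 else 0.0


def toy_features(z, weights):
    """Hypothetical stand-in for discriminator features of a generated image.

    A real system would compute features of G(z) with the discriminator.
    """
    return [math.tanh(dot(row, z)) for row in weights]


rng = random.Random(0)
z1, z2 = orthogonal_latent_pair(8, rng)
weights = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]

sim = feature_similarity(toy_features(z1, weights), toy_features(z2, weights))
# Assumed penalty: small when features of orthogonal latents are dissimilar.
lfm_penalty = sim ** 2
```

In this sketch a generator would be pushed to lower `lfm_penalty`, so that orthogonal latent directions map to dissimilar features. How the discriminator's objective opposes this is described in the paper rather than the abstract, so it is left out here.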

updated: Fri Oct 28 2022 16:34:48 GMT+0000 (UTC)

published: Fri Oct 28 2022 16:34:48 GMT+0000 (UTC)

arXiv
