Dual Projection Generative Adversarial Networks for Conditional Image Generation

Ligong Han; Martin Renqiang Min; Anastasis Stathopoulos; Yu Tian; Ruijiang Gao; Asim Kadav; Dimitris Metaxas

Conditional Generative Adversarial Networks (cGANs) extend the standard unconditional GAN framework to learn joint data-label distributions from samples, and have been established as powerful generative models capable of generating high-fidelity imagery. A challenge in training such a model lies in properly infusing class information into its generator and discriminator. For the discriminator, class conditioning can be achieved by either (1) directly incorporating labels as input or (2) involving labels in an auxiliary classification loss. In this paper, we show that the former directly aligns the class-conditioned fake and real data distributions P(image|class) (data matching), while the latter aligns the data-conditioned class distributions P(class|image) (label matching). Class separability does not directly translate to sample quality, and it becomes a burden if classification itself is intrinsically difficult. Even so, the discriminator cannot provide useful guidance for the generator if features of distinct classes are mapped to the same point and thus become inseparable. Motivated by this intuition, we propose a Dual Projection GAN (P2GAN) model that learns to balance data matching and label matching. We then propose an improved cGAN model with Auxiliary Classification that directly aligns the fake and real conditionals P(class|image) by minimizing their f-divergence. Experiments on a synthetic Mixture of Gaussian (MoG) dataset and a variety of real-world datasets, including CIFAR100, ImageNet, and VGGFace2, demonstrate the efficacy of our proposed models.
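To make the two discriminator conditioning options in the abstract concrete, here is a minimal sketch in plain Python. It is not the paper's P2GAN implementation. It assumes a projection-style logit for option (1), where the label selects an embedding that is dotted with the image feature, and a softmax cross-entropy classifier for option (2). All names (`projection_logit`, `auxiliary_classification_loss`, `class_embeddings`, `w_uncond`) and the random toy values are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def projection_logit(feature, label, class_embeddings, w_uncond, bias=0.0):
    """Option (1): the label is an input to the discriminator.

    Assumed form: an unconditional term (w_uncond . feature + bias) plus a
    class-dependent term (class_embeddings[label] . feature).
    """
    return dot(w_uncond, feature) + bias + dot(class_embeddings[label], feature)


def auxiliary_classification_loss(feature, label, classifier_weights):
    """Option (2): the label appears only in an auxiliary classification loss.

    Returns the cross-entropy -log softmax(logits)[label], computed with a
    max-shifted log-sum-exp for numerical stability.
    """
    logits = [dot(w, feature) for w in classifier_weights]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]


# Toy values standing in for a learned image feature and learned weights.
rng = random.Random(0)
feature_dim, num_classes = 4, 3
feature = [rng.gauss(0, 1) for _ in range(feature_dim)]
class_embeddings = [[rng.gauss(0, 1) for _ in range(feature_dim)]
                    for _ in range(num_classes)]
w_uncond = [rng.gauss(0, 1) for _ in range(feature_dim)]

# One real/fake logit per candidate label for the same image feature.
logits_per_label = [
    projection_logit(feature, y, class_embeddings, w_uncond)
    for y in range(num_classes)
]

# One classification loss per candidate label; the embeddings are reused as
# classifier weights here purely to keep the sketch short.
ce_per_label = [
    auxiliary_classification_loss(feature, y, class_embeddings)
    for y in range(num_classes)
]
```

In this sketch the first function scores a (feature, label) pair as real or fake, so the label changes the adversarial logit itself, whereas the second function never produces a real/fake score and only measures how well the label can be predicted from the feature.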

updated: Mon Nov 29 2021 05:47:13 GMT+0000 (UTC)

published: Fri Aug 20 2021 06:10:38 GMT+0000 (UTC)

arXiv

