A U-Net Based Discriminator for Generative Adversarial Networks

Edgar Schönfeld; Bernt Schiele; Anna Khoreva

Among the major remaining challenges for generative adversarial networks (GANs) is the capacity to synthesize globally and locally coherent images with object shapes and textures indistinguishable from real images. To target this issue, we propose an alternative U-Net based discriminator architecture, borrowing insights from the segmentation literature. The proposed U-Net based architecture provides detailed per-pixel feedback to the generator and, by also providing global image feedback, maintains the global coherence of synthesized images. Empowered by the per-pixel response of the discriminator, we further propose a per-pixel consistency regularization technique based on the CutMix data augmentation, encouraging the U-Net discriminator to focus more on semantic and structural changes between real and fake images. This improves the U-Net discriminator training, further enhancing the quality of generated samples. The novel discriminator improves over the state of the art in terms of the standard distribution and image quality metrics, enabling the generator to synthesize images with varying structure, appearance and levels of detail while maintaining global and local realism. Compared to the BigGAN baseline, we achieve an average improvement of 2.7 FID points across FFHQ, CelebA, and the newly introduced COCO-Animals dataset. The code is available at https://github.com/boschresearch/unetgan.
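The per-pixel consistency idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`random_box_mask`, `cutmix`, `consistency_loss`) are hypothetical, the uniform box sampling is an illustrative choice, and the squared-error form of the penalty is an assumption, since the abstract does not state the exact loss. The sketch only shows the general principle that the per-pixel output on a CutMix composite should agree with the CutMix composite of the per-pixel outputs on the real and fake images.

```python
import numpy as np


def random_box_mask(height, width, rng):
    """Binary mask: 1 keeps the real image, 0 pastes in the fake image."""
    mask = np.ones((height, width), dtype=np.float32)
    # Box size and position are sampled uniformly (an illustrative assumption).
    box_h = int(rng.integers(1, height + 1))
    box_w = int(rng.integers(1, width + 1))
    top = int(rng.integers(0, height - box_h + 1))
    left = int(rng.integers(0, width - box_w + 1))
    mask[top:top + box_h, left:left + box_w] = 0.0
    return mask


def cutmix(real, fake, mask):
    """Combine two arrays: `real` where mask == 1, `fake` where mask == 0."""
    return mask * real + (1.0 - mask) * fake


def consistency_loss(per_pixel_d, real, fake, mask):
    """Mean squared gap between D(mix(real, fake)) and mix(D(real), D(fake)).

    `per_pixel_d` is any callable mapping a (C, H, W) image to an (H, W) map;
    the squared-error form is an assumption made for this sketch.
    """
    pred_on_mix = per_pixel_d(cutmix(real, fake, mask))
    mix_of_preds = cutmix(per_pixel_d(real), per_pixel_d(fake), mask)
    return float(np.mean((pred_on_mix - mix_of_preds) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(3, 8, 8)).astype(np.float32)
    fake = rng.normal(size=(3, 8, 8)).astype(np.float32)
    mask = random_box_mask(8, 8, rng)

    # Toy stand-in for a discriminator decoder: the per-pixel channel mean.
    toy_d = lambda image: image.mean(axis=0)
    print("consistency loss:", consistency_loss(toy_d, real, fake, mask))
```

A stand-in that looks at each pixel in isolation, like `toy_d` here, satisfies the constraint trivially. The penalty only becomes informative for a discriminator whose per-pixel output depends on surrounding context, which is the case for a U-Net decoder.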

updated: Fri Mar 19 2021 23:22:06 GMT+0000 (UTC)

published: Fri Feb 28 2020 11:16:54 GMT+0000 (UTC)

arXiv
