Self-Distilled StyleGAN: Towards Generation from Internet Photos

Ron Mokady; Michal Yarom; Omer Tov; Oran Lang; Daniel Cohen-Or; Tali Dekel; Michal Irani; Inbar Mosseri

StyleGAN is known to produce high-fidelity images while also offering unprecedented semantic editing. However, these abilities have been demonstrated only on a limited set of datasets, which are usually structurally aligned and well curated. In this paper, we show how StyleGAN can be adapted to work on raw, uncurated images collected from the Internet. Such image collections pose two main challenges for StyleGAN: they contain many outlier images, and they are characterized by a multi-modal distribution. Training StyleGAN on such raw image collections results in degraded image synthesis quality. To meet these challenges, we propose a StyleGAN-based self-distillation approach, which consists of two main components: (i) generative-based self-filtering of the dataset to eliminate outlier images and produce an adequate training set, and (ii) perceptual clustering of the generated images to detect the inherent data modalities, which are then used to improve StyleGAN's "truncation trick" in the image synthesis process. The presented technique enables the generation of high-quality images while minimizing the loss in diversity of the data. Through qualitative and quantitative evaluation, we demonstrate the power of our approach on new, challenging, and diverse domains collected from the Internet. New datasets and pre-trained models are available at https://self-distilled-stylegan.github.io/.
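The two components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, and the abstract does not specify these details. The sketch assumes that self-filtering keeps the images with the smallest generator reconstruction distance, that each detected modality is summarized by a center in latent space, and that truncation pulls a latent code toward its nearest center (by Euclidean distance) instead of toward the single global mean. The function names, `keep_fraction`, and `psi` values are hypothetical and chosen for illustration only.

```python
import numpy as np


def self_filter(recon_dist, keep_fraction=0.5):
    """Return a boolean mask selecting the images to keep for training.

    Assumption: recon_dist[i] is some distance between image i and its
    reconstruction by a generator trained on the raw collection, so a large
    value suggests an outlier. The abstract does not specify the measure.
    """
    threshold = np.quantile(recon_dist, keep_fraction)
    return recon_dist <= threshold


def truncate_global(w, w_mean, psi=0.7):
    """Standard truncation trick: interpolate codes toward the global mean."""
    return w_mean + psi * (w - w_mean)


def truncate_multimodal(w, centers, psi=0.7):
    """Truncate each latent code toward its nearest cluster center.

    w has shape (n, d) and centers has shape (k, d). Assumption: the centers
    come from clustering generated images and averaging the latent codes in
    each cluster; nearest-center assignment in latent space is illustrative.
    """
    dists = np.linalg.norm(w[:, None, :] - centers[None, :, :], axis=-1)
    nearest = centers[np.argmin(dists, axis=1)]
    return nearest + psi * (w - nearest)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(4, 8))        # toy latent codes
    centers = rng.normal(size=(3, 8))  # toy cluster centers
    print(truncate_multimodal(w, centers, psi=0.5).shape)
```

With `psi=1` both truncation functions return the input codes unchanged, and with `psi=0` they collapse to the mean or to the nearest center. Under the assumptions above, the multi-modal variant would keep a truncated sample inside its own modality rather than dragging it toward an average of all modalities, which is one way to read the abstract's claim about preserving diversity.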

updated: Thu Feb 24 2022 17:16:47 GMT+0000 (UTC)

published: Thu Feb 24 2022 17:16:47 GMT+0000 (UTC)

arXiv
