PetsGAN: Rethinking Priors for Single Image Generation

Zicheng Zhang; Yinglu Liu; Congying Han; Hailin Shi; Tiande Guo; Bowen Zhou

Single image generation (SIG), the task of generating diverse samples whose visual content is similar to that of a given single image, was first introduced by SinGAN, which builds a pyramid of GANs to progressively learn the internal patch distribution of the image. It also shows great potential in a wide range of image manipulation tasks. However, the SinGAN paradigm has limitations in generation quality and training time. First, lacking high-level information, SinGAN cannot handle object images as well as it handles scene and texture images. Second, its separate progressive training scheme is time-consuming and prone to artifact accumulation. To tackle these problems, in this paper we dig into the SIG problem and improve SinGAN by fully utilizing internal and external priors. The main contributions of this paper are: 1) We introduce a regularized latent variable model to SIG. To the best of our knowledge, this is the first clear formulation and optimization goal given for SIG, and all existing SIG methods can be regarded as special cases of this model. 2) We design a novel Prior-based end-to-end training GAN (PetsGAN) to overcome the problems of SinGAN. Our method dispenses with the time-consuming progressive training scheme and can be trained end-to-end. 3) We conduct extensive qualitative and quantitative experiments to show the superiority of our method in generated image quality, diversity, and training speed. Moreover, we apply our method to other image manipulation tasks (e.g., style transfer, harmonization), and the results further demonstrate its effectiveness and efficiency.

updated: Thu Mar 03 2022 02:31:50 GMT+0000 (UTC)

published: Thu Mar 03 2022 02:31:50 GMT+0000 (UTC)

arXiv
