Towards Controllable and Photorealistic Region-wise Image Manipulation

Ansheng You; Chenglin Zhou; Qixuan Zhang; Lan Xu

Adaptive and flexible image editing is a desirable capability of modern generative models. In this work, we present a generative model with an auto-encoder architecture for per-region style manipulation. We apply a code consistency loss to enforce an explicit disentanglement between the content and style latent representations, so that the content and style of generated samples are consistent with their corresponding content and style references. The model is also constrained by a content alignment loss to ensure that foreground editing does not interfere with background content. As a result, given region-of-interest masks provided by users, our model supports region-wise style transfer on the foreground. Notably, our model requires no extra annotations such as semantic labels and relies only on self-supervision. Extensive experiments show the effectiveness of the proposed method and demonstrate the flexibility of the proposed model in various applications, including region-wise style editing, latent space interpolation, and cross-domain style transfer.
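As a rough illustration of the two constraints named in the abstract, the following sketch shows one way such losses could be written. It is not the authors' implementation: the abstract does not give the exact formulas, so the L1 distance, the function names `code_consistency_loss` and `background_penalty`, and the convention that the mask is 1 on the editable foreground are all assumptions made here for illustration.

```python
import numpy as np


def code_consistency_loss(content_ref, style_ref, content_gen, style_gen):
    """Hypothetical code consistency term (assumed L1 form).

    content_ref / style_ref: latent codes of the content and style references.
    content_gen / style_gen: codes obtained by re-encoding the generated sample.
    The loss is zero when the generated sample re-encodes to the reference codes.
    """
    content_term = np.mean(np.abs(content_gen - content_ref))
    style_term = np.mean(np.abs(style_gen - style_ref))
    return float(content_term + style_term)


def background_penalty(output, source, mask):
    """Hypothetical stand-in for a content alignment constraint.

    Averages the absolute pixel change outside the user mask, where
    mask == 1 marks the editable foreground (an assumed convention).
    """
    background = np.broadcast_to(1.0 - mask, output.shape)
    changed = np.sum(background * np.abs(output - source))
    return float(changed / max(np.sum(background), 1.0))


# Toy usage: edit only the masked region of a random "image".
rng = np.random.default_rng(0)
source = rng.random((4, 4))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0                  # foreground region chosen by the user
output = source + 0.5 * mask          # change pixels inside the mask only
print(background_penalty(output, source, mask))
```

In this toy case the penalty is zero because every changed pixel lies inside the mask; any change leaking into the background would make it positive.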

updated: Thu Aug 19 2021 13:29:45 GMT+0000 (UTC)

published: Thu Aug 19 2021 13:29:45 GMT+0000 (UTC)

arXiv
