Layout-to-Image Translation with Double Pooling Generative Adversarial Networks

Hao Tang; Nicu Sebe

In this paper, we address the task of layout-to-image translation, which aims to translate an input semantic layout into a realistic image. One open challenge widely observed in existing methods is the lack of effective semantic constraints during the image translation process, which leads to models that cannot preserve semantic information and that ignore the semantic dependencies within the same object. To address this issue, we propose a novel Double Pooling GAN (DPGAN) for generating photo-realistic and semantically consistent results from the input layout. We also propose a novel Double Pooling Module (DPM), which consists of the Square-shape Pooling Module (SPM) and the Rectangle-shape Pooling Module (RPM). Specifically, SPM aims to capture short-range semantic dependencies of the input layout at different spatial scales, while RPM aims to capture long-range semantic dependencies along both the horizontal and vertical directions. We then fuse the outputs of SPM and RPM to further enlarge the receptive field of our generator. Extensive experiments on five popular datasets show that the proposed DPGAN achieves better results than state-of-the-art methods. Finally, both SPM and RPM are general and can be seamlessly integrated into any GAN-based architecture to strengthen the feature representation. The code is available at https://github.com/Ha0Tang/DPGAN.
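To make the contrast between the two pooling shapes concrete, here is a minimal pure-Python sketch of the idea on a single-channel feature map. It is not the authors' implementation: the abstract does not specify the operators, so the function names (`square_pool`, `rectangle_pool`, `double_pool`), the use of average pooling, the nearest-neighbour upsampling, and the fusion by plain summation are all assumptions made for illustration. The actual module presumably operates on multi-channel tensors with learned layers.

```python
def square_pool(feat, bins):
    """Assumed SPM-style branch: average over a bins x bins grid of
    square-ish regions, then copy each mean back to its region."""
    h, w = len(feat), len(feat[0])
    if not (1 <= bins <= min(h, w)):
        raise ValueError("bins must be between 1 and min(height, width)")
    out = [[0.0] * w for _ in range(h)]
    for bi in range(bins):
        r0, r1 = bi * h // bins, (bi + 1) * h // bins
        for bj in range(bins):
            c0, c1 = bj * w // bins, (bj + 1) * w // bins
            cells = [feat[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            mean = sum(cells) / len(cells)
            for r in range(r0, r1):
                for c in range(c0, c1):
                    out[r][c] = mean
    return out


def rectangle_pool(feat):
    """Assumed RPM-style branch: average over full-width rows and
    full-height columns, then broadcast both back and add them."""
    h, w = len(feat), len(feat[0])
    row_mean = [sum(row) / w for row in feat]
    col_mean = [sum(feat[r][c] for r in range(h)) / h for c in range(w)]
    return [[row_mean[r] + col_mean[c] for c in range(w)] for r in range(h)]


def double_pool(feat, scales=(1, 2)):
    """Assumed fusion: input plus every square-pool scale plus the
    rectangle-pool branch, summed element-wise."""
    h, w = len(feat), len(feat[0])
    fused = [row[:] for row in feat]
    branches = [square_pool(feat, b) for b in scales] + [rectangle_pool(feat)]
    for branch in branches:
        for r in range(h):
            for c in range(w):
                fused[r][c] += branch[r][c]
    return fused


# A 4 x 4 map with a single active cell in the top-left corner.
feat = [[1.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]

print(square_pool(feat, 2)[0][3])   # 0.0: the 2 x 2 grid stays local
print(rectangle_pool(feat)[0][3])   # 0.25: the row strip reaches the far end
```

In this toy input, the square branch at a 2 x 2 grid only spreads the activation within its own quadrant, while the rectangle branch carries it along the whole first row and first column. That is the short-range versus long-range distinction the abstract draws between SPM and RPM.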

updated: Sun Aug 29 2021 19:55:14 GMT+0000 (UTC)

published: Sun Aug 29 2021 19:55:14 GMT+0000 (UTC)

arXiv
