Semantic Image Synthesis via Efficient Class-Adaptive Normalization

Zhentao Tan; Dongdong Chen; Qi Chu; Menglei Chai; Jing Liao; Mingming He; Lu Yuan; Gang Hua; Nenghai Yu



Spatially-adaptive normalization (SPADE) has recently been remarkably successful in conditional semantic image synthesis. It modulates the normalized activations with spatially-varying transformations learned from semantic layouts, which prevents the semantic information from being washed away. Despite its impressive performance, a more thorough understanding of where its advantages come from is still needed to help reduce the significant computation and parameter overhead that this structure introduces. In this paper, we analyze the effectiveness of spatially-adaptive normalization in depth from a return-on-investment point of view and observe that its modulation parameters benefit more from semantic-awareness than from spatial-adaptiveness, especially for high-resolution input masks. Inspired by this observation, we propose class-adaptive normalization (CLADE), a lightweight but equally effective variant that is adaptive only to the semantic class. To further improve spatial-adaptiveness, we introduce an intra-class positional map encoding calculated from semantic layouts to modulate the normalization parameters of CLADE, and propose a truly spatially-adaptive variant of CLADE, namely CLADE-ICPE. Benefiting from this design, CLADE greatly reduces the computation cost while still preserving the semantic information during generation. Through extensive experiments on multiple challenging datasets, we demonstrate that the proposed CLADE generalizes to different SPADE-based methods and achieves generation quality comparable to SPADE while being much more efficient, with fewer extra parameters and lower computational cost. The code is available at https://github.com/tzt101/CLADE.git
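To make "adaptive only to the semantic class" concrete, the following is a minimal sketch of one possible reading of the idea, not the authors' implementation (see the linked repository for that). It assumes that the scale and shift applied after normalization are looked up from per-class tables indexed by the semantic mask, rather than predicted per pixel by a network as in SPADE. The function name `clade_modulate`, the table layout, and the instance-norm-style statistics are all assumptions made for illustration.

```python
from statistics import fmean, pvariance


def clade_modulate(features, mask, gamma, beta, eps=1e-5):
    """Class-adaptive modulation of a normalized feature map (illustrative sketch).

    features: C x H x W nested lists of floats
    mask:     H x W nested lists of integer class ids
    gamma:    num_classes x C table of scales (learned in a real model)
    beta:     num_classes x C table of shifts (learned in a real model)
    """
    out = []
    for c, channel in enumerate(features):
        # Parameter-free normalization over each channel. Using per-channel
        # (instance-norm-style) statistics is an assumption of this sketch.
        flat = [v for row in channel for v in row]
        mean = fmean(flat)
        std = (pvariance(flat, mean) + eps) ** 0.5
        # Each pixel picks its scale and shift from the row of the table
        # that matches its class id; no per-pixel parameters are predicted.
        out.append([
            [gamma[k][c] * (v - mean) / std + beta[k][c]
             for v, k in zip(row, mask_row)]
            for row, mask_row in zip(channel, mask)
        ])
    return out


# Toy input: one channel, a 2x2 map, two classes (top row 0, bottom row 1).
features = [[[1.0, 2.0], [3.0, 4.0]]]
mask = [[0, 0], [1, 1]]
gamma = [[1.0], [2.0]]   # class 0 keeps the normalized value, class 1 doubles it
beta = [[0.0], [10.0]]   # class 1 is additionally shifted by 10
modulated = clade_modulate(features, mask, gamma, beta)
```

In this sketch the number of modulation parameters grows with `num_classes * C` and does not depend on the spatial resolution, which is the kind of saving the abstract attributes to dropping per-pixel adaptiveness. The intra-class positional encoding used by CLADE-ICPE is not modelled here.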

updated: Tue Dec 08 2020 18:59:32 GMT+0000 (UTC)

published: Tue Dec 08 2020 18:59:32 GMT+0000 (UTC)

arXiv

