Semantic Image Synthesis with Spatially-Adaptive Normalization

Taesung Park; Ming-Yu Liu; Ting-Chun Wang; Jun-Yan Zhu

We propose spatially-adaptive normalization, a simple but effective layer for synthesizing photorealistic images given an input semantic layout. Previous methods feed the semantic layout directly into the deep network as input, where it is processed through stacks of convolution, normalization, and nonlinearity layers. We show that this is suboptimal because the normalization layers tend to "wash away" semantic information. To address the issue, we propose using the input layout to modulate the activations in normalization layers through a spatially-adaptive, learned transformation. Experiments on several challenging datasets demonstrate the advantage of the proposed method over existing approaches in both visual fidelity and alignment with input layouts. Finally, our model allows user control over both semantics and style. Code is available at https://github.com/NVlabs/SPADE.
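
The modulation described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the repository above for that). It assumes the learned transformation is reduced to a per-label lookup table, which is what a 1x1 convolution over a one-hot layout amounts to. The names `spade_sketch`, `w_gamma`, and `w_beta` are hypothetical and chosen only for illustration.

```python
import numpy as np

def spade_sketch(x, seg, w_gamma, w_beta, eps=1e-5):
    """Spatially-adaptive modulation of normalized activations (sketch).

    x       : (N, C, H, W) activations
    seg     : (N, H, W) integer semantic labels, same spatial size as x
    w_gamma : (L, C) per-label scale table (stand-in for a learned conv)
    w_beta  : (L, C) per-label shift table (stand-in for a learned conv)
    """
    # Parameter-free normalization: per-channel statistics over batch and space.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)

    # Scale and shift vary per pixel because they are looked up from the layout.
    gamma = w_gamma[seg].transpose(0, 3, 1, 2)  # (N, H, W, C) -> (N, C, H, W)
    beta = w_beta[seg].transpose(0, 3, 1, 2)
    return gamma * x_hat + beta

# Constant activations normalize to all zeros, so a plain normalization layer
# would carry no layout information forward. Here the layout survives in beta.
seg = np.array([[[0, 0], [1, 1]]])            # top row: label 0, bottom row: label 1
x = np.ones((1, 3, 2, 2))
w_gamma = np.ones((2, 3))
w_beta = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
out = spade_sketch(x, seg, w_gamma, w_beta)
```

In this toy case the output differs between the two label regions even though the input activations are uniform, which is the "wash away" problem the abstract refers to and the effect the modulation is meant to counter.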

updated: Tue Nov 05 2019 15:41:27 GMT+0000 (UTC)

published: Mon Mar 18 2019 08:12:23 GMT+0000 (UTC)

arXiv
