Generative Adversarial Transformers

Drew A. Hudson; C. Lawrence Zitnick

We introduce the GANsformer, a novel and efficient type of transformer, and explore it for the task of visual generative modeling. The network employs a bipartite structure that enables long-range interactions across the image while maintaining linear computational efficiency, so it can readily scale to high-resolution synthesis. It iteratively propagates information from a set of latent variables to the evolving visual features and vice versa, supporting the refinement of each in light of the other and encouraging the emergence of compositional representations of objects and scenes. In contrast to the classic transformer architecture, it utilizes multiplicative integration that allows flexible region-based modulation, and can thus be seen as a generalization of the successful StyleGAN network. We demonstrate the model's strength and robustness through a careful evaluation over a range of datasets, from simulated multi-object environments to rich real-world indoor and outdoor scenes, showing that it achieves state-of-the-art results in terms of image quality and diversity while enjoying fast learning and better data efficiency. Further qualitative and quantitative experiments offer insight into the model's inner workings, revealing improved interpretability and stronger disentanglement, and illustrating the benefits and efficacy of our approach. An implementation of the model is available at https://github.com/dorarad/gansformer.
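
The bipartite structure and multiplicative integration described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes a single attention head, covers only the latents-to-features direction, uses random matrices in place of learned weights, and assumes the multiplicative integration takes a scale-and-shift form over normalized features. The names `bipartite_modulation`, `w_gamma`, and `w_beta` are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def bipartite_modulation(features, latents, rng):
    """One latents-to-features update (illustrative sketch only).

    features: (n, d) array, one row per image position.
    latents:  (k, d) array, one row per latent variable.
    Returns the modulated features (n, d) and the attention map (n, k).
    """
    n, d = features.shape
    # Assumption: random projections stand in for learned weights.
    w_q, w_k, w_v, w_gamma, w_beta = (
        rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(5)
    )

    # Attention runs only between the two groups (positions and latents),
    # so the score matrix is (n, k) rather than (n, n): cost grows
    # linearly with the number of positions n for a fixed k.
    queries = features @ w_q
    keys = latents @ w_k
    values = latents @ w_v
    attn = softmax(queries @ keys.T / np.sqrt(d), axis=-1)
    aggregated = attn @ values  # (n, d): latent information per position

    # Assumed form of multiplicative integration: normalize each channel
    # over positions, then scale and shift it per position.
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True) + 1e-8
    normed = (features - mean) / std
    gamma = aggregated @ w_gamma
    beta = aggregated @ w_beta
    return (1.0 + gamma) * normed + beta, attn


rng = np.random.default_rng(0)
feats = rng.standard_normal((64, 8))  # e.g. an 8x8 grid, flattened
lats = rng.standard_normal((4, 8))    # four latent variables
out, attn = bipartite_modulation(feats, lats, rng)
print(out.shape, attn.shape)
```

With a single latent (`k = 1`) every attention weight equals 1, so all positions receive the same scale and shift. That global modulation is the sense in which a region-based scheme like this one can be read as generalizing a StyleGAN-style update.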

updated: Tue Mar 02 2021 18:39:04 GMT+0000 (UTC)

published: Mon Mar 01 2021 18:54:04 GMT+0000 (UTC)

arXiv
