Soft-IntroVAE: Analyzing and Improving the Introspective Variational Autoencoder

Tal Daniel; Aviv Tamar

The recently introduced introspective variational autoencoder (IntroVAE) exhibits outstanding image generation, and allows for amortized inference using an image encoder. The main idea in IntroVAE is to train a VAE adversarially, using the VAE encoder to discriminate between generated and real data samples. However, the original IntroVAE loss function relied on a particular hinge-loss formulation that is very hard to stabilize in practice, and its theoretical convergence analysis ignored important terms in the loss. In this work, we take a step towards a better understanding of the IntroVAE model, its practical implementation, and its applications. We propose the Soft-IntroVAE, a modified IntroVAE that replaces the hinge-loss terms with a smooth exponential loss on generated samples. This change significantly improves training stability, and also enables theoretical analysis of the complete algorithm. Interestingly, we show that the IntroVAE converges to a distribution that minimizes the sum of the KL divergence from the data distribution and an entropy term. We discuss the implications of this result, and demonstrate that it induces competitive image generation and reconstruction. Finally, we describe two applications of Soft-IntroVAE to unsupervised image translation and out-of-distribution detection, and demonstrate compelling results. Code and additional information are available on the project website: https://taldatech.github.io/soft-intro-vae-web
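
As a rough illustration of the change described above, the sketch below contrasts a hinge-style penalty on a generated sample with a smooth exponential one. The abstract gives no formulas, so the exact functional forms, the function names (`hinge_term`, `soft_exp_term`), and the parameters `margin` and `alpha` are assumptions made for illustration only, not the authors' implementation.

```python
import math


def hinge_term(kl_fake: float, margin: float) -> float:
    """Assumed hinge-style penalty on a generated sample: max(0, margin - KL).

    The penalty (and its gradient) is exactly zero once KL >= margin.
    """
    return max(0.0, margin - kl_fake)


def soft_exp_term(elbo_fake: float, alpha: float = 2.0) -> float:
    """Assumed smooth exponential penalty: (1 / alpha) * exp(alpha * ELBO).

    The penalty is positive and differentiable everywhere, and it shrinks
    smoothly as the ELBO of the generated sample decreases.
    """
    return math.exp(alpha * elbo_fake) / alpha


if __name__ == "__main__":
    # Hypothetical values, chosen only to show the shape of each term.
    for kl in (1.0, 3.0, 5.0):
        print(f"hinge(kl={kl}, margin=3.0) = {hinge_term(kl, 3.0)}")
    for elbo in (-5.0, -1.0, 0.0):
        print(f"soft_exp(elbo={elbo}) = {soft_exp_term(elbo):.6f}")
```

Under these assumptions, the hinge term has a kink at the margin and gives no learning signal beyond it, whereas the exponential term has no hard threshold to tune. That is one plausible reading of why a smooth loss might be easier to stabilize.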

updated: Thu Mar 25 2021 07:12:51 GMT+0000 (UTC)

published: Thu Dec 24 2020 13:53:29 GMT+0000 (UTC)

arXiv
