Consistency Regularization for Variational Auto-Encoders

Samarth Sinha; Adji B. Dieng

変分オートエンコーダーの一貫性正則化

変分オートエンコーダー (VAE) は、教師なし学習への強力なアプローチです。これらは、変分推論 (VI) を使用して潜在変数モデルでスケーラブルな近似事後推論を可能にします。 VAE は、データを入力として受け取るエンコーダーと呼ばれるディープニューラルネットワークによってパラメーター化された変分族を仮定します。このエンコーダーはすべての観測で共有され、推論のコストを償却します。ただし、VAE のエンコーダーには、与えられた観察とそのセマンティクスを保持する変換を異なる潜在表現にマッピングするという望ましくない特性があります。エンコーダーのこの「不一致」は、特にダウンストリームタスクの学習した表現の品質を低下させ、一般化にも悪影響を及ぼします。この論文では、VAE の一貫性を強化する正則化方法を提案します。この考え方は、観測値を条件付けたときの変分分布と、この観測値のランダムな意味保存変換を条件付けたときの変分分布の間のカルバック・ライブラー (KL) ダイバージェンスを最小化することです。この正則化は、任意の VAE に適用できます。私たちの実験では、それをいくつかのベンチマークデータセットの 4 つの異なる VAE バリアントに適用し、学習した表現の品質を常に改善するだけでなく、より一般化することも発見しました。特に、Nouveau Variational Auto-Encoder (NVAE) に適用すると、正規化手法により、MNIST および CIFAR-10 で最先端のパフォーマンスが得られます。また、この方法を 3D データに適用したところ、下流の分類タスクでの精度によって測定される優れた品質の表現を学習することがわかりました。

Variational auto-encoders (VAEs) are a powerful approach to unsupervised learning. They enable scalable approximate posterior inference in latent-variable models using variational inference (VI). A VAE posits a variational family parameterized by a deep neural network called an encoder that takes data as input. This encoder is shared across all the observations, which amortizes the cost of inference. However the encoder of a VAE has the undesirable property that it maps a given observation and a semantics-preserving transformation of it to different latent representations. This "inconsistency" of the encoder lowers the quality of the learned representations, especially for downstream tasks, and also negatively affects generalization. In this paper, we propose a regularization method to enforce consistency in VAEs. The idea is to minimize the Kullback-Leibler (KL) divergence between the variational distribution when conditioning on the observation and the variational distribution when conditioning on a random semantic-preserving transformation of this observation. This regularization is applicable to any VAE. In our experiments we apply it to four different VAE variants on several benchmark datasets and found it always improves the quality of the learned representations but also leads to better generalization. In particular, when applied to the Nouveau Variational Auto-Encoder (NVAE), our regularization method yields state-of-the-art performance on MNIST and CIFAR-10. We also applied our method to 3D data and found it learns representations of superior quality as measured by accuracy on a downstream classification task.

updated: Mon May 31 2021 10:26:32 GMT+0000 (UTC)

published: Mon May 31 2021 10:26:32 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト