On Algorithmic Stability in Unsupervised Representation Learning

Matthew Willetts; Brooks Paige


In this paper, we investigate the algorithmic stability of unsupervised representation learning with deep generative models, as a function of repeated re-training on the same input data. Algorithms for learning low-dimensional linear representations -- for example principal components analysis (PCA), or linear independent components analysis (ICA) -- come with guarantees that they will always reveal the same latent representations (perhaps up to an arbitrary rotation or permutation). Unfortunately, for non-linear representation learning, such as in a variational auto-encoder (VAE) model trained by stochastic gradient descent, we have no such guarantees. Recent work on identifiability in non-linear ICA has introduced a family of deep generative models that have identifiable latent representations, achieved by conditioning on side information (e.g. informative labels). We empirically evaluate the stability of these models under repeated re-estimation of parameters, and compare them both to standard VAEs and to deep generative models which learn to cluster in their latent space. Surprisingly, we discover that side information is not necessary for algorithmic stability: using standard quantitative measures of identifiability, we find that deep generative models with latent clusterings are empirically identifiable to the same degree as models which rely on auxiliary labels. We relate these results to the possibility of identifiable non-linear ICA.
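To make the notion of stability "up to permutation" concrete, the following is a minimal sketch of one way two training runs could be compared. It assumes the measure is a mean correlation coefficient (MCC) between the latents of the two runs, maximised over permutations of the latent dimensions; the abstract does not name the measure it uses, so this choice, the function name `mean_correlation_coefficient`, and the synthetic "runs" below are all illustrative assumptions rather than the authors' procedure.

```python
import itertools

import numpy as np


def mean_correlation_coefficient(z_a, z_b):
    """MCC between two (n_samples, d) latent arrays, maximised over permutations.

    Hypothetical helper for illustration; not taken from the paper.
    """
    d = z_a.shape[1]
    # Absolute Pearson correlation between each dimension of run A and run B.
    corr = np.abs(np.corrcoef(z_a, z_b, rowvar=False)[:d, d:])
    # Brute-force search over all pairings; only feasible for small d.
    best = max(
        corr[np.arange(d), list(perm)].mean()
        for perm in itertools.permutations(range(d))
    )
    return float(best)


rng = np.random.default_rng(0)
z_run1 = rng.normal(size=(1000, 3))
# Synthetic stand-in for a second run: same latents, permuted, sign-flipped, noisy.
z_run2 = -z_run1[:, [2, 0, 1]] + 0.05 * rng.normal(size=(1000, 3))
# Synthetic stand-in for a run that recovered unrelated latents.
z_unrelated = rng.normal(size=(1000, 3))

mcc_stable = mean_correlation_coefficient(z_run1, z_run2)
mcc_unrelated = mean_correlation_coefficient(z_run1, z_unrelated)
print(f"stable pair: {mcc_stable:.3f}, unrelated pair: {mcc_unrelated:.3f}")
```

A score near 1 indicates that the two runs recovered the same latent dimensions up to reordering and sign, while a score near 0 indicates unrelated representations. The brute-force permutation search keeps the sketch dependency-free; for larger latent dimensions an assignment solver would be the more practical choice.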

updated: Fri May 20 2022 17:01:15 GMT+0000 (UTC)

published: Wed Jun 09 2021 17:22:08 GMT+0000 (UTC)

arXiv

