Visual Representation Learning Does Not Generalize Strongly Within the Same Domain

Lukas Schott; Julius von Kügelgen; Frederik Träuble; Peter Gehler; Chris Russell; Matthias Bethge; Bernhard Schölkopf; Francesco Locatello; Wieland Brendel

視覚表現学習は、同じドメイン内で強く一般化されません

機械学習を一般化するための重要な要素は、変動の潜在的な要因と、各要因が世界で作用するメカニズムを明らかにすることです。このホワイトペーパーでは、17の教師なし、弱教師あり、および完全教師あり表現学習アプローチが、単純なデータセット（dSprites、Shapes3D、MPI3D）の変動の生成要因を正しく推測するかどうかをテストします。ブラーやその他の（非）構造化ノイズなど、テスト時間中に新しい変動要因を導入する以前のロバスト性作業とは対照的に、ここでは、トレーニングデータセットから既存の変動要因のみを再構成、内挿、または外挿します（たとえば、小さいトレーニング中は中型のオブジェクト、テスト中は大型のオブジェクト）。正しいメカニズムを学習するモデルは、このベンチマークに一般化できるはずです。合計で、2000以上のモデルをトレーニングおよびテストし、監視信号やアーキテクチャのバイアスに関係なく、すべてのモデルが基礎となるメカニズムの学習に苦労していることを確認しています。さらに、人工データセットからより現実的な実世界のデータセットに移行すると、テストされたすべてのモデルの一般化機能が大幅に低下します。正しいメカニズムを特定できないにもかかわらず、モデルは非常にモジュール化されており、他の分布内の要因を推測する能力はかなり安定しており、単一の要因のみが分布外になっています。これらの結果は、一般化を容易にすることができる観測の機械論的モデルを学習するという重要であるが十分に研究されていない問題を示しています。

An important component for generalization in machine learning is to uncover underlying latent factors of variation as well as the mechanism through which each factor acts in the world. In this paper, we test whether 17 unsupervised, weakly supervised, and fully supervised representation learning approaches correctly infer the generative factors of variation in simple datasets (dSprites, Shapes3D, MPI3D). In contrast to prior robustness work that introduces novel factors of variation during test time, such as blur or other (un)structured noise, we here recompose, interpolate, or extrapolate only existing factors of variation from the training data set (e.g., small and medium-sized objects during training and large objects during testing). Models that learn the correct mechanism should be able to generalize to this benchmark. In total, we train and test 2000+ models and observe that all of them struggle to learn the underlying mechanism regardless of supervision signal and architectural bias. Moreover, the generalization capabilities of all tested models drop significantly as we move from artificial datasets towards more realistic real-world datasets. Despite their inability to identify the correct mechanism, the models are quite modular as their ability to infer other in-distribution factors remains fairly stable, providing only a single factor is out-of-distribution. These results point to an important yet understudied problem of learning mechanistic models of observations that can facilitate generalization.

updated: Sat Jul 17 2021 11:24:18 GMT+0000 (UTC)

published: Sat Jul 17 2021 11:24:18 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト