Generative Models as a Data Source for Multiview Representation Learning

Ali Jahanian; Xavier Puig; Yonglong Tian; Phillip Isola

Generative models are now capable of producing highly realistic images that look nearly indistinguishable from the data on which they are trained. This raises the question: if we have good enough generative models, do we still need datasets? We investigate this question in the setting of learning general-purpose visual representations from a black-box generative model rather than directly from data. Given an off-the-shelf image generator without any access to its training data, we train representations from the samples output by this generator. We compare several representation learning methods that can be applied to this setting, using the latent space of the generator to generate multiple "views" of the same semantic content. We show that for contrastive methods, this multiview data can naturally be used to identify positive pairs (nearby in latent space) and negative pairs (far apart in latent space). We find that the resulting representations rival or even outperform those learned directly from real data, but that good performance requires care in the sampling strategy applied and the training method. Generative models can be viewed as a compressed and organized copy of a dataset, and we envision a future where more and more "model zoos" proliferate while datasets become increasingly unwieldy, missing, or private. This paper suggests several techniques for dealing with visual representation learning in such a future. Code is available on our project page https://ali-design.github.io/GenRep/.
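The positive/negative pairing described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_generator` is a hypothetical stand-in for a black-box image generator, positives are formed with a Gaussian perturbation of scale `sigma` in latent space (one of several possible sampling strategies), negatives are simply the other independently sampled latents in the batch, and the encoder is omitted so that the InfoNCE loss is computed directly on the toy generator outputs.

```python
import numpy as np

def sample_latent_views(n, dim, sigma, rng):
    """Draw n anchor latents and one nearby latent 'view' of each (assumed Gaussian shift)."""
    z_anchor = rng.standard_normal((n, dim))
    z_pos = z_anchor + sigma * rng.standard_normal((n, dim))
    return z_anchor, z_pos

def toy_generator(z, weight):
    """Hypothetical stand-in for a black-box generator: latent -> flattened 'image'."""
    return np.tanh(z @ weight)

def info_nce(feat_a, feat_b, temperature=0.1):
    """InfoNCE loss: row i of feat_a is the positive for row i of feat_b;
    every other row in the batch acts as a negative."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    logits = (a @ b.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)
z_anchor, z_pos = sample_latent_views(n=64, dim=32, sigma=0.2, rng=rng)
weight = rng.standard_normal((32, 128)) / np.sqrt(32)  # fixed random "generator" weights
x_anchor = toy_generator(z_anchor, weight)
x_pos = toy_generator(z_pos, weight)

# Correctly paired views should give a lower loss than randomly re-paired ones.
loss_matched = info_nce(x_anchor, x_pos)
loss_shuffled = info_nce(x_anchor, x_pos[rng.permutation(64)])
print(loss_matched, loss_shuffled)
```

In a real setup the two views would be passed through a trainable encoder before the loss, and both `sigma` and the temperature would need tuning, in line with the abstract's remark that performance depends on the sampling strategy and training method.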

updated: Tue Mar 15 2022 01:20:44 GMT+0000 (UTC)

published: Wed Jun 09 2021 17:54:55 GMT+0000 (UTC)

arXiv
