Uncertainty in Contrastive Learning: On the Predictability of Downstream Performance

Shervin Ardeshir; Navid Azizan

The superior performance of some of today's state-of-the-art deep learning models is owed in part to extensive (self-)supervised contrastive pretraining on large-scale datasets. In contrastive learning, the network is presented with pairs of positive (similar) and negative (dissimilar) datapoints and is trained to find an embedding vector, i.e., a representation, for each datapoint, which can then be fine-tuned for various downstream tasks. To deploy these models safely in critical decision-making systems, it is crucial to equip them with a measure of their uncertainty or reliability. However, because a contrastive model is trained on pairs and its output (an abstract embedding vector) carries no absolute label, adapting conventional uncertainty estimation techniques to such models is non-trivial. In this work, we study whether the uncertainty of such a representation can be quantified in a meaningful way for a single datapoint. In other words, we explore whether the downstream performance on a given datapoint is predictable directly from its pre-trained embedding. We show that this goal can be achieved by directly estimating the distribution of the training data in the embedding space and accounting for the local consistency of the representations. Our experiments show that this notion of uncertainty for an embedding vector often correlates strongly with its downstream accuracy.
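To make the idea of estimating the training distribution in the embedding space more concrete, the following is a minimal sketch and not the authors' method. It assumes a simple stand-in: the mean Euclidean distance from a query embedding to its k nearest training embeddings serves as an inverse-density proxy, so a larger value suggests the point lies in a sparsely covered region. The function name `knn_distance_score`, the choice of `k`, and the synthetic embeddings are all hypothetical, and the local-consistency term mentioned in the abstract is not modeled here because the abstract does not specify it.

```python
import numpy as np


def knn_distance_score(train_emb, query_emb, k=5):
    """Mean Euclidean distance from each query to its k nearest training embeddings.

    Hypothetical inverse-density proxy: larger scores mean the query sits
    farther from the training data in the embedding space.
    """
    # Pairwise differences, shape (n_query, n_train, dim).
    diffs = query_emb[:, None, :] - train_emb[None, :, :]
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(diffs, axis=-1)
    # Keep the k smallest distances for each query and average them.
    nearest = np.sort(dists, axis=1)[:, :k]
    return nearest.mean(axis=1)


# Synthetic stand-ins for pre-trained embeddings (assumption, not real data).
rng = np.random.default_rng(0)
train_emb = rng.normal(size=(200, 8))
in_dist = rng.normal(size=(5, 8))          # drawn like the training data
far = rng.normal(size=(5, 8)) + 10.0       # shifted far from the training data

scores_in = knn_distance_score(train_emb, in_dist)
scores_far = knn_distance_score(train_emb, far)
```

In this toy setup the shifted points receive clearly larger scores than the in-distribution ones. Whether such a score tracks downstream accuracy on real embeddings is the empirical question the paper investigates.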

updated: Tue Jul 19 2022 15:44:59 GMT+0000 (UTC)

published: Tue Jul 19 2022 15:44:59 GMT+0000 (UTC)

arXiv
