Similarity and Generalization: From Noise to Corruption

Nayara Fonseca; Veronica Guidetti

Contrastive learning aims to extract distinctive features from data by finding an embedding representation in which similar samples are close to each other and dissimilar ones are far apart. We study how neural networks (NNs) generalize the concept of similarity in the presence of noise, investigating two phenomena: Double Descent (DD) behavior and online/offline correspondence. While DD examines how the network adjusts to the dataset over long training times or as the number of parameters increases, online/offline correspondence compares network performance as the quality (diversity) of the dataset varies. We focus on the simplest representative of contrastive learning: Siamese Neural Networks (SNNs). We point out that SNNs can be affected by two distinct sources of noise: Pair Label Noise (PLN) and Single Label Noise (SLN). The effect of SLN is asymmetric but preserves similarity relations, whereas PLN is symmetric but breaks transitivity. We find that DD also appears in SNNs and is exacerbated by noise. We show that the dataset topology crucially affects generalization. While sparse datasets show the same performance under SLN and PLN for an equal amount of noise, SLN outperforms PLN in the overparametrized region for dense datasets. Indeed, in this regime, the PLN similarity violation becomes macroscopic, corrupting the dataset to the point where complete overfitting cannot be achieved. We call this phenomenon Density-Induced Break of Similarity (DIBS). Probing the equivalence between online optimization and offline generalization in SNNs, we find that their correspondence breaks down in the presence of label noise in all the scenarios considered.
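
The distinction between SLN and PLN drawn in the abstract can be illustrated with a small Python sketch. It is not the authors' code: the noise procedures, the function names (`apply_sln`, `apply_pln`, `count_transitivity_violations`), and the toy dataset are assumptions made for illustration. Here SLN is modeled as corrupting class labels before pairs are formed, so the resulting pair labels still derive from a class assignment and remain transitive. PLN is modeled as flipping pair labels directly, which can produce triples where a is similar to b and b is similar to c, yet a is labeled dissimilar to c.

```python
import itertools
import random


def pair_labels(labels):
    """Similarity label (1 = same class, 0 = different) for every unordered pair."""
    return {
        (i, j): int(labels[i] == labels[j])
        for i, j in itertools.combinations(range(len(labels)), 2)
    }


def apply_sln(labels, num_classes, p, rng):
    """Assumed SLN model: with probability p, replace a sample's class label
    by a different class, then build pair labels from the noisy classes."""
    noisy = []
    for y in labels:
        if rng.random() < p:
            y = rng.choice([c for c in range(num_classes) if c != y])
        noisy.append(y)
    return pair_labels(noisy)


def apply_pln(labels, p, rng):
    """Assumed PLN model: build clean pair labels, then flip each pair's
    similarity label independently with probability p."""
    clean = pair_labels(labels)
    return {pair: (1 - s if rng.random() < p else s) for pair, s in clean.items()}


def count_transitivity_violations(sim, n):
    """Count triples with exactly two similar pairs: a~b and b~c but not a~c."""
    violations = 0
    for i, j, k in itertools.combinations(range(n), 3):
        if sim[(i, j)] + sim[(j, k)] + sim[(i, k)] == 2:
            violations += 1
    return violations


rng = random.Random(0)
num_classes = 3
labels = [i % num_classes for i in range(30)]  # toy dataset, 10 samples per class

sln_pairs = apply_sln(labels, num_classes, p=0.2, rng=rng)
pln_pairs = apply_pln(labels, p=0.2, rng=rng)

print("SLN transitivity violations:", count_transitivity_violations(sln_pairs, len(labels)))
print("PLN transitivity violations:", count_transitivity_violations(pln_pairs, len(labels)))
```

Under these assumptions the SLN count is zero by construction, since any pair labeling derived from class labels is an equivalence relation. The PLN count depends on the random draws and on how many triples the dataset contains, which is one informal way to see why denser datasets would expose more violations.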

updated: Thu Oct 13 2022 18:05:30 GMT+0000 (UTC)

published: Sun Jan 30 2022 12:53:51 GMT+0000 (UTC)

arXiv
