Adaptive Sample Selection for Robust Learning under Label Noise

Deep Patel; P. S. Sastry



Deep Neural Networks (DNNs) have been shown to be susceptible to memorization or overfitting in the presence of noisily-labelled data. Several algorithms have been proposed for robust learning under such noisy data. A prominent class of algorithms relies on sample selection strategies wherein, essentially, a fraction of samples with loss values below a certain threshold is selected for training. These algorithms are sensitive to such thresholds, and it is difficult to fix or learn them. Often, these algorithms also require information such as label noise rates, which are typically unavailable in practice. In this paper, we propose an adaptive sample selection strategy that relies only on batch statistics of a given mini-batch to provide robustness against label noise. The algorithm does not have any additional hyperparameters for sample selection, does not need any information on noise rates, and does not need access to separate data with clean labels. We empirically demonstrate the effectiveness of our algorithm on benchmark datasets.
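The abstract does not state the selection rule itself, so the following is only a minimal sketch of the general idea of a threshold derived from the mini-batch rather than fixed in advance. It assumes, purely for illustration, that a sample is kept when its cross-entropy loss is at most the mean loss of its mini-batch; the function names `cross_entropy` and `select_by_batch_mean` and the toy data are hypothetical and are not taken from the paper.

```python
import math


def cross_entropy(probs, label):
    """Per-sample cross-entropy loss from predicted class probabilities."""
    # Clamp to avoid log(0) when a class receives zero probability.
    return -math.log(max(probs[label], 1e-12))


def select_by_batch_mean(losses):
    """Return indices of samples whose loss is at most the mini-batch mean.

    Assumption: the batch mean stands in for a "batch statistic" threshold.
    The paper's actual rule is not given in the abstract.
    """
    threshold = sum(losses) / len(losses)
    return [i for i, loss in enumerate(losses) if loss <= threshold]


# Toy mini-batch: predicted probabilities and possibly noisy labels.
batch_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.2, 0.8],
    [0.7, 0.3],
]
batch_labels = [0, 0, 1, 1]  # the last label disagrees with the prediction

losses = [cross_entropy(p, y) for p, y in zip(batch_probs, batch_labels)]
selected = select_by_batch_mean(losses)
print(selected)
```

In this sketch the threshold is recomputed for every mini-batch, so no noise rate and no separately tuned cutoff is supplied; only the selected samples would contribute to the gradient step.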

updated: Mon Dec 05 2022 07:05:20 GMT+0000 (UTC)

published: Tue Jun 29 2021 12:10:58 GMT+0000 (UTC)

arXiv
