BadLabel: A Robust Perspective on Evaluating and Enhancing Label-noise Learning

Jingfeng Zhang; Bo Song; Haohan Wang; Bo Han; Tongliang Liu; Lei Liu; Masashi Sugiyama

Label-noise learning (LNL) aims to improve a model's generalization when the training data contain noisy labels. To facilitate practical LNL algorithms, researchers have proposed different label-noise types, ranging from class-conditional to instance-dependent noise. In this paper, we introduce a novel label-noise type called BadLabel, which can degrade the performance of existing LNL algorithms by a large margin. BadLabel is crafted based on the label-flipping attack against standard classification, where specific samples are selected and their labels are flipped to other labels so that the loss values of clean and noisy labels become indistinguishable. To address the challenge posed by BadLabel, we further propose a robust LNL method that perturbs the labels in an adversarial manner at each epoch to make the loss values of clean and noisy labels distinguishable again. Once we select a small set of (mostly) clean labeled data, we can apply semi-supervised learning techniques to train the model accurately. Our experiments show that existing LNL algorithms are vulnerable to the newly introduced BadLabel noise type, while our proposed robust LNL method effectively improves the generalization performance of the model under various types of label noise. The new dataset of noisy labels and the source code of the robust LNL algorithm are available at https://github.com/zjfheart/BadLabels.
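
The abstract relies on the idea that clean and noisy labels can often be told apart by their loss values. The following is a minimal sketch of that idea only, and not the authors' implementation: it injects ordinary symmetric (class-conditional) noise rather than BadLabel, replaces the trained network with a hypothetical fixed predictor, and every function name in it is an illustrative assumption.

```python
import math
import random


def flip_labels_symmetric(labels, noise_rate, num_classes, seed=0):
    """Flip a fixed fraction of labels to a different, uniformly chosen class.

    This is plain symmetric noise, NOT the BadLabel attack from the paper.
    """
    rng = random.Random(seed)
    n_flip = int(round(noise_rate * len(labels)))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    noisy = []
    for i, y in enumerate(labels):
        if i in flip_idx:
            # Pick any class except the original one.
            noisy.append(rng.choice([c for c in range(num_classes) if c != y]))
        else:
            noisy.append(y)
    return noisy, flip_idx


def per_sample_cross_entropy(probs, labels, eps=1e-12):
    """Cross-entropy of each predicted distribution against its given label."""
    return [-math.log(max(p[y], eps)) for p, y in zip(probs, labels)]


def select_small_loss(losses, keep_fraction):
    """Return the indices of the keep_fraction samples with the smallest loss."""
    n_keep = int(round(keep_fraction * len(losses)))
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    return sorted(order[:n_keep])


num_classes = 3
true_labels = [i % num_classes for i in range(10)]
noisy_labels, flipped = flip_labels_symmetric(true_labels, 0.3, num_classes)

# Hypothetical stand-in for a trained model: probability 0.9 on the true
# class and 0.05 on each of the other two classes.
probs = [[0.9 if c == y else 0.05 for c in range(num_classes)] for y in true_labels]

losses = per_sample_cross_entropy(probs, noisy_labels)
selected = select_small_loss(losses, keep_fraction=0.7)
print(sorted(flipped), selected)
```

In this toy setting the flipped samples have a much larger loss than the untouched ones, so keeping the small-loss fraction recovers the clean subset. According to the abstract, BadLabel is constructed so that this gap disappears, which is why selection of this kind fails under it.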

updated: Mon Feb 12 2024 12:06:40 GMT+0000 (UTC)

published: Sun May 28 2023 06:26:23 GMT+0000 (UTC)

arXiv
