Poisoning the Unlabeled Dataset of Semi-Supervised Learning

Nicholas Carlini

Semi-supervised machine learning models learn from a (small) set of labeled training examples and a (large) set of unlabeled training examples. State-of-the-art models can reach within a few percentage points of fully-supervised training while requiring 100x less labeled data. We study a new class of vulnerabilities: poisoning attacks that modify the unlabeled dataset. To be useful, unlabeled datasets are given strictly less review than labeled datasets, and adversaries can therefore poison them easily. By inserting maliciously-crafted unlabeled examples totaling just 0.1% of the dataset size, we can manipulate a model trained on this poisoned dataset to misclassify arbitrary examples at test time (as any desired label). Our attacks are highly effective across datasets and semi-supervised learning methods. We find that more accurate methods (thus more likely to be used) are significantly more vulnerable to poisoning attacks, and as such, better training methods are unlikely to prevent this attack. To counter this, we explore the space of defenses and propose two methods that mitigate our attack.
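To give a sense of scale for the 0.1% figure, the following minimal sketch computes how many examples an attacker would insert and mixes them into an unlabeled set. It is illustrative only and does not reproduce the paper's method for crafting poisoned examples. The function names, the 50,000-example dataset size, and the reading of "dataset size" as the size of the unlabeled set are all assumptions.

```python
import random


def poison_budget(num_unlabeled: int, fraction: float = 0.001) -> int:
    """Number of examples to insert for a given poisoning fraction.

    Assumption: the fraction is measured against the unlabeled set size.
    """
    return int(round(num_unlabeled * fraction))


def insert_poison(unlabeled, poison, seed: int = 0):
    """Return a shuffled copy of the unlabeled set with poison mixed in.

    Unlabeled data carries no labels, so the inserted items are
    indistinguishable by position or annotation once shuffled.
    """
    mixed = list(unlabeled) + list(poison)
    random.Random(seed).shuffle(mixed)
    return mixed


if __name__ == "__main__":
    n_unlabeled = 50_000  # hypothetical dataset size
    budget = poison_budget(n_unlabeled)
    print(f"{budget} poisoned examples for {n_unlabeled} unlabeled examples")
```

At this assumed size, the budget is a few dozen examples among tens of thousands, which is consistent with the abstract's point that lightly reviewed unlabeled data is easy to contaminate.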

updated: Tue Aug 10 2021 07:38:44 GMT+0000 (UTC)

published: Tue May 04 2021 16:55:20 GMT+0000 (UTC)

arXiv
