Machine Learning Security against Data Poisoning: Are We There Yet?

Antonio Emanuele Cinà; Kathrin Grosse; Ambra Demontis; Battista Biggio; Fabio Roli; Marcello Pelillo

The recent success of machine learning has been fueled by the increasing availability of computing power and large amounts of data in many different applications. However, the trustworthiness of the resulting models can be compromised when such data is maliciously manipulated to mislead the learning process. In this article, we first review poisoning attacks that compromise the training data used to learn machine-learning models, including attacks that aim to reduce the overall performance, manipulate the predictions on specific test samples, and even implant backdoors in the model. We then discuss how to mitigate these attacks before, during, and after model training. We conclude our article by formulating some relevant open challenges which are hindering the development of testing methods and benchmarks suitable for assessing and improving the trustworthiness of machine-learning models against data poisoning attacks.
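
To make two of the attack families named in the abstract concrete, here is a minimal sketch of how training data might be poisoned. It is an illustration under stated assumptions, not the authors' method: the function names `flip_labels` and `implant_backdoor`, the binary 0/1 labels, and the dictionary-based `trigger` are all hypothetical choices made for this example. The first function flips a fraction of labels (an attack on overall performance). The second stamps a fixed trigger pattern on a fraction of samples and relabels them (a backdoor).

```python
import random


def flip_labels(labels, rate, seed=0):
    """Indiscriminate poisoning sketch: flip a fraction of binary (0/1) labels."""
    rng = random.Random(seed)
    n_poison = int(rate * len(labels))
    chosen = set(rng.sample(range(len(labels)), n_poison))
    return [1 - y if i in chosen else y for i, y in enumerate(labels)]


def implant_backdoor(samples, labels, trigger, target_label, rate, seed=0):
    """Backdoor poisoning sketch: stamp a trigger on a fraction of samples
    and relabel those samples as target_label.

    trigger is a hypothetical dict mapping feature index -> value that is
    overwritten in each poisoned sample.
    """
    rng = random.Random(seed)
    n_poison = int(rate * len(samples))
    chosen = set(rng.sample(range(len(samples)), n_poison))
    poisoned_x, poisoned_y = [], []
    for i, (x, y) in enumerate(zip(samples, labels)):
        x = list(x)  # copy so the clean data is left untouched
        if i in chosen:
            for idx, value in trigger.items():
                x[idx] = value
            y = target_label
        poisoned_x.append(x)
        poisoned_y.append(y)
    return poisoned_x, poisoned_y


if __name__ == "__main__":
    clean_x = [[0.0, 0.0, 0.0] for _ in range(10)]
    clean_y = [0, 1] * 5
    print(flip_labels(clean_y, rate=0.2))
    px, py = implant_backdoor(
        clean_x, clean_y, trigger={2: 1.0}, target_label=1, rate=0.5
    )
    print(sum(x[2] == 1.0 for x in px), "samples carry the trigger")
```

A model trained on the output of `implant_backdoor` may learn to associate the trigger with `target_label`, so that any test input carrying the trigger is misclassified. Realistic attacks are typically more subtle than this sketch, for example by optimizing the poisoned samples rather than choosing them at random.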

updated: Tue Apr 12 2022 17:52:09 GMT+0000 (UTC)

published: Tue Apr 12 2022 17:52:09 GMT+0000 (UTC)

arXiv
