On Practical Aspects of Aggregation Defenses against Data Poisoning Attacks

Wenxiao Wang; Soheil Feizi

The increasing access to data poses both opportunities and risks in deep learning, as one can manipulate the behaviors of deep learning models with malicious training samples. Such attacks are known as data poisoning. Recent advances in defense strategies against data poisoning have highlighted the effectiveness of aggregation schemes in achieving state-of-the-art results in certified poisoning robustness. However, the practical implications of these approaches remain unclear. Here we focus on Deep Partition Aggregation, a representative aggregation defense, and assess its practical aspects, including efficiency, performance, and robustness. For evaluations, we use ImageNet resized to a resolution of 64 by 64 to enable evaluations at a larger scale than previous ones. Firstly, we demonstrate a simple yet practical approach to scaling base models, which improves the efficiency of training and inference for aggregation defenses. Secondly, we provide empirical evidence supporting the data-to-complexity ratio, i.e. the ratio between the data set size and sample complexity, as a practical estimation of the maximum number of base models that can be deployed while preserving accuracy. Last but not least, we point out how aggregation defenses boost poisoning robustness empirically through the poisoning overfitting phenomenon, which is the key underlying mechanism for the empirical poisoning robustness of aggregations. Overall, our findings provide valuable insights for practical implementations of aggregation defenses to mitigate the threat of data poisoning.
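
As a rough illustration of the kind of aggregation defense the abstract refers to, the sketch below shows the two core steps of a Deep Partition Aggregation style pipeline: assigning each training sample to one of k disjoint partitions with a deterministic hash, and aggregating the base models' predictions by plurality vote. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`partition_index`, `partition_dataset`, `aggregate`), the use of SHA-256 on raw sample bytes, and the tie-breaking rule toward the smaller class index are choices made here for illustration, and training the base models is omitted entirely.

```python
import hashlib
from collections import Counter


def partition_index(sample: bytes, k: int) -> int:
    """Map a sample to one of k partitions using a hash of its content.

    Assumption: samples are available as raw bytes. The assignment depends
    only on the sample itself, so adding or removing one sample never moves
    any other sample to a different partition.
    """
    digest = hashlib.sha256(sample).digest()
    return int.from_bytes(digest[:8], "big") % k


def partition_dataset(samples, labels, k):
    """Split (sample, label) pairs into k disjoint partitions."""
    parts = [[] for _ in range(k)]
    for x, y in zip(samples, labels):
        parts[partition_index(x, k)].append((x, y))
    return parts


def aggregate(votes):
    """Return the class with the most votes across base models.

    Ties are broken toward the smaller class index (an assumption made
    here so that the output is deterministic).
    """
    counts = Counter(votes)
    return min(counts, key=lambda c: (-counts[c], c))


if __name__ == "__main__":
    samples = [b"img-0", b"img-1", b"img-2", b"img-3"]
    labels = [0, 1, 0, 1]
    parts = partition_dataset(samples, labels, k=2)
    print([len(p) for p in parts])

    # Hypothetical predictions from five base models for one test input.
    print(aggregate([1, 1, 0, 2, 1]))
```

Because each training sample lands in exactly one partition, a single poisoned sample can influence at most one base model, and therefore at most one vote. Increasing k strengthens that isolation but leaves each base model with less data, which is the trade-off the abstract's data-to-complexity ratio is meant to estimate.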

updated: Wed Jun 28 2023 17:59:35 GMT+0000 (UTC)

published: Wed Jun 28 2023 17:59:35 GMT+0000 (UTC)

arXiv
