Universal Detection of Backdoor Attacks via Density-based Clustering and Centroids Analysis

Wei Guo; Benedetta Tondi; Mauro Barni

In this paper, we propose a Universal Defence based on Clustering and Centroids Analysis (CCA-UD) against backdoor attacks. The goal of the proposed defence is to reveal whether a Deep Neural Network model is subject to a backdoor attack by inspecting the training dataset. CCA-UD first clusters the samples of the training set by means of density-based clustering. Then, it applies a novel strategy to detect the presence of poisoned clusters. The proposed strategy is based on a general misclassification behaviour observed when the features of a representative example of the analysed cluster are added to benign samples. The capability of inducing a misclassification error is a general characteristic of poisoned samples, hence the proposed defence is attack-agnostic. This marks a significant difference with respect to existing defences, which either can defend against only some types of backdoor attacks, e.g., when the attacker corrupts the label of the poisoned samples, or are effective only when some conditions on the poisoning ratio adopted by the attacker or the kind of triggering pattern used by the attacker are satisfied. Experiments carried out on several classification tasks, considering different types of backdoor attacks and triggering patterns, including both local and global triggers, reveal that the proposed method is very effective at defending against backdoor attacks in all cases, always outperforming state-of-the-art techniques.
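The two stages described in the abstract (density-based clustering, then a misclassification test on a cluster representative) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the representative is chosen or how features are combined, so the sketch assumes the cluster centroid as the representative and plain vector addition in feature space. The toy 2-D features, the linear `classify` rule, and the names `dbscan`, `misclassification_ratio`, `EPS`, `MIN_PTS` and `THRESHOLD` are all hypothetical.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: one label per point, -1 marks noise."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:   # only core points expand the cluster
                queue.extend(nb)
    return labels

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def misclassification_ratio(representative, benign, classify, target_class):
    """Fraction of benign feature vectors classified as target_class
    after the representative's features are added to them."""
    hits = sum(
        classify([a + b for a, b in zip(f, representative)]) == target_class
        for f in benign
    )
    return hits / len(benign)

# Hypothetical toy classifier head: feature 1 plays the role of a trigger feature.
def classify(x):
    return 1 if x[0] + 4.0 * x[1] > 0 else 0

# Features of training samples labelled as class 1 (made-up numbers).
class1_features = [
    (1.0, 0.0), (1.1, 0.1), (0.9, -0.1), (1.0, 0.1), (1.1, -0.1),   # benign-like
    (0.0, 2.0), (0.1, 2.1), (-0.1, 1.9), (0.0, 2.1), (0.1, 1.9),    # poisoned-like
]
# Features of trusted benign samples from another class (class 0).
benign_class0 = [(-3.0, 0.0), (-3.1, 0.1), (-2.9, -0.1), (-3.0, -0.1)]

EPS, MIN_PTS, THRESHOLD = 0.5, 3, 0.5   # assumed values, not from the paper

labels = dbscan(class1_features, EPS, MIN_PTS)
ratios = {}
for c in sorted(set(labels) - {-1}):
    members = [p for p, l in zip(class1_features, labels) if l == c]
    ratios[c] = misclassification_ratio(centroid(members), benign_class0, classify, 1)

# A cluster is flagged when its representative flips most benign samples.
flagged = [c for c, r in ratios.items() if r > THRESHOLD]
print(ratios, flagged)
```

On this toy data the clustering separates the two groups, and only the second cluster's centroid pushes the class-0 samples into class 1, so only that cluster is flagged. The point values are chosen to make the effect visible and say nothing about how the method behaves on real networks.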

updated: Wed Jan 11 2023 16:31:38 GMT+0000 (UTC)

published: Wed Jan 11 2023 16:31:38 GMT+0000 (UTC)

arXiv
