Backdoor Attacks Against Deep Learning Systems in the Physical World

Emily Wenger; Josephine Passananti; Arjun Bhagoji; Yuanshun Yao; Haitao Zheng; Ben Y. Zhao

Backdoor attacks embed hidden malicious behaviors into deep learning models; these behaviors activate and cause misclassifications only on model inputs containing a specific trigger. Existing work on backdoor attacks and defenses, however, mostly focuses on digital attacks that use digitally generated patterns as triggers. A critical question remains unanswered: can backdoor attacks succeed using physical objects as triggers, thus making them a credible threat against deep learning systems in the real world? We conduct a detailed empirical study to explore this question for facial recognition, a critical deep learning task. Using seven physical objects as triggers, we collect a custom dataset of 3205 images of ten volunteers and use it to study the feasibility of physical backdoor attacks under a variety of real-world conditions. Our study reveals two key findings. First, physical backdoor attacks can be highly successful if they are carefully configured to overcome the constraints imposed by physical objects. In particular, the placement of successful triggers is largely constrained by the target model's dependence on key facial features. Second, four of today's state-of-the-art defenses against (digital) backdoors are ineffective against physical backdoors, because the use of physical objects breaks core assumptions used to construct these defenses. Our study confirms that (physical) backdoor attacks are not a hypothetical phenomenon but rather pose a serious real-world threat to critical classification tasks. We need new and more robust defenses against backdoors in the physical world.
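The mechanism the abstract describes (a model that misbehaves only when a trigger is present) is usually planted by poisoning the training data. The following is a minimal sketch of that idea, not the authors' pipeline: it assumes each image is already tagged with whether the subject wears a physical trigger object, and the names `Sample`, `build_poisoned_set`, `attack_success_rate`, and the `injection_rate` parameter are hypothetical, introduced only for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    image_id: str      # hypothetical identifier for one photo
    label: int         # identity class of the person in the photo
    has_trigger: bool  # True if the person wears the trigger object


def build_poisoned_set(clean, triggered, target_label, injection_rate, seed=0):
    """Mix clean samples with trigger samples relabeled to target_label.

    injection_rate is the assumed fraction of the final training set
    that consists of relabeled (poisoned) trigger samples.
    """
    if not 0.0 <= injection_rate < 1.0:
        raise ValueError("injection_rate must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_poison / (n_clean + n_poison) = injection_rate for n_poison.
    n_poison = round(injection_rate * len(clean) / (1.0 - injection_rate))
    n_poison = min(n_poison, len(triggered))
    chosen = rng.sample(triggered, n_poison)
    # Poisoning step: keep the image, overwrite the label with the target.
    poisoned = [Sample(s.image_id, target_label, True) for s in chosen]
    mixed = list(clean) + poisoned
    rng.shuffle(mixed)
    return mixed


def attack_success_rate(predictions, samples, target_label):
    """Fraction of triggered samples (true label != target) predicted as target."""
    hits = 0
    total = 0
    for pred, s in zip(predictions, samples):
        if s.has_trigger and s.label != target_label:
            total += 1
            if pred == target_label:
                hits += 1
    return hits / total if total else 0.0
```

A model trained on the mixed set would then be evaluated twice: on clean images for ordinary accuracy, and on held-out trigger images with `attack_success_rate`. The training step itself is omitted here because the abstract does not specify it.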

updated: Wed Apr 14 2021 16:41:55 GMT+0000 (UTC)

published: Thu Jun 25 2020 17:26:20 GMT+0000 (UTC)

arXiv
