Hidden Backdoor Attack against Semantic Segmentation Models

Yiming Li; Yanjie Li; Yalei Lv; Yong Jiang; Shu-Tao Xia

Deep neural networks (DNNs) are vulnerable to backdoor attacks, which embed hidden backdoors in DNNs by poisoning the training data. An attacked model behaves normally on benign samples, but its prediction changes to a particular target label when the hidden backdoor is activated. So far, backdoor research has focused mostly on classification tasks. In this paper, we reveal that this threat also applies to semantic segmentation, which may further endanger many mission-critical applications (e.g., autonomous driving). In addition to extending the existing attack paradigm to maliciously manipulate segmentation models at the image level, we propose a novel attack paradigm, the fine-grained attack, in which we treat the target label (i.e., the annotation) at the object level instead of the image level to achieve more sophisticated manipulation. In the annotations of poisoned samples generated by the fine-grained attack, only the pixels of specific objects are labeled with the attacker-specified target class, while all other pixels keep their ground-truth labels. Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of the training data. Our methods not only provide a new perspective for designing novel attacks but also serve as a strong baseline for improving the robustness of semantic segmentation methods.
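The fine-grained poisoning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes images and annotation masks are plain 2D lists of integers, and the function names, the trigger (overwriting the top row of the image), and the poisoning rate used in the usage note are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import random


def stamp_trigger(image, value=0, rows=1):
    """Return a copy of `image` whose top `rows` rows are set to `value`.

    Assumption: this simple trigger pattern is hypothetical; the abstract
    does not say what trigger is used.
    """
    return [
        [value] * len(row) if r < rows else list(row)
        for r, row in enumerate(image)
    ]


def fine_grained_annotation(mask, source_class, target_class):
    """Relabel only the pixels of `source_class` as `target_class`.

    All other pixels keep their ground-truth label, which is the
    object-level manipulation the abstract describes.
    """
    return [
        [target_class if label == source_class else label for label in row]
        for row in mask
    ]


def poison_dataset(samples, source_class, target_class, rate, seed=0):
    """Poison a fraction `rate` of (image, mask) pairs; leave the rest intact."""
    rng = random.Random(seed)
    n_poison = int(len(samples) * rate)
    chosen = set(rng.sample(range(len(samples)), n_poison))
    poisoned = []
    for i, (image, mask) in enumerate(samples):
        if i in chosen:
            poisoned.append((
                stamp_trigger(image),
                fine_grained_annotation(mask, source_class, target_class),
            ))
        else:
            poisoned.append((image, mask))
    return poisoned
```

For example, `poison_dataset(samples, source_class=1, target_class=3, rate=0.2)` would stamp the trigger on one fifth of the samples and relabel only their class-1 pixels as class 3. An image-level variant would instead replace the whole annotation of each poisoned sample.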

updated: Sat Apr 03 2021 05:07:33 GMT+0000 (UTC)

published: Sat Mar 06 2021 05:50:29 GMT+0000 (UTC)

arXiv
