Visual Analytics of Neuron Vulnerability to Adversarial Attacks on Convolutional Neural Networks

Yiran Li; Junpeng Wang; Takanori Fujiwara; Kwan-Liu Ma

Adversarial attacks on a convolutional neural network (CNN), which inject human-imperceptible perturbations into an input image, can fool a high-performance CNN into making incorrect predictions. The success of such attacks raises serious concerns about the robustness of CNNs and prevents them from being used in safety-critical applications such as medical diagnosis and autonomous driving. Our work introduces a visual analytics approach to understanding adversarial attacks by answering two questions: (1) which neurons are more vulnerable to attacks, and (2) which image features do these vulnerable neurons capture during the prediction? For the first question, we introduce multiple perturbation-based measures that break down the attacking magnitude into individual CNN neurons and rank the neurons by their vulnerability levels. For the second, we identify image features (e.g., cat ears) that highly stimulate a user-selected neuron to augment and validate the neuron's responsibility. Furthermore, we support interactive exploration of a large number of neurons with hierarchical clustering based on the neurons' roles in the prediction. To this end, we design a visual analytics system that incorporates visual reasoning for interpreting adversarial attacks. We validate the effectiveness of our system through multiple case studies and feedback from domain experts.
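The abstract does not spell out the perturbation-based measures, so the following is only a minimal sketch of one plausible measure, not the authors' method. It assumes that a "neuron" is a feature-map channel of one layer, that activations for the same images are available before and after the attack as arrays of shape (images, channels, height, width), and that vulnerability is scored as the mean absolute change in activation. The function name `rank_neurons_by_vulnerability` and the toy data are hypothetical.

```python
import numpy as np


def rank_neurons_by_vulnerability(clean_acts, adv_acts):
    """Rank channels by how much an attack changes their activations.

    Assumption (not from the paper): vulnerability of a channel is the
    mean absolute difference between its adversarial and clean
    activations, averaged over images and spatial positions.

    clean_acts, adv_acts: arrays of shape (N, C, H, W).
    Returns (ranking, scores): channel indices from most to least
    perturbed, and the per-channel score.
    """
    clean_acts = np.asarray(clean_acts, dtype=float)
    adv_acts = np.asarray(adv_acts, dtype=float)
    if clean_acts.shape != adv_acts.shape or clean_acts.ndim != 4:
        raise ValueError("expected two arrays of identical shape (N, C, H, W)")

    # Average |adversarial - clean| over every axis except the channel axis.
    scores = np.abs(adv_acts - clean_acts).mean(axis=(0, 2, 3))

    # Sort channels by descending score; a stable sort keeps ties in index order.
    ranking = np.argsort(-scores, kind="stable")
    return ranking, scores


# Toy data standing in for real layer activations (hypothetical values).
rng = np.random.default_rng(0)
clean = rng.normal(size=(4, 3, 5, 5))
adv = clean.copy()
adv[:, 1] += 0.5  # channel 1 is shifted the most
adv[:, 2] -= 0.1  # channel 2 is shifted slightly; channel 0 is untouched

ranking, scores = rank_neurons_by_vulnerability(clean, adv)
print(ranking, scores)
```

In practice the two activation arrays would come from forward passes of the same CNN on clean and attacked images, and other scores (for example, a change normalized by the clean activation magnitude) could be substituted without changing the ranking step.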

updated: Mon Mar 06 2023 01:01:56 GMT+0000 (UTC)

published: Mon Mar 06 2023 01:01:56 GMT+0000 (UTC)

arXiv
