Symmetry Defense Against CNN Adversarial Perturbation Attacks

Blerta Lindqvist

This paper uses symmetry to make Convolutional Neural Network classifiers (CNNs) robust against adversarial perturbation attacks. Such attacks add perturbations to original images to generate adversarial images that fool classifiers, such as the road sign classifiers of autonomous vehicles. Although symmetry is a pervasive aspect of the natural world, CNNs do not handle symmetry well. For example, a CNN can classify an image differently from its mirror image. For an adversarial image that is misclassified with a wrong label l_w, the CNN's inability to handle symmetry means that a symmetric version of the adversarial image can be classified with a label other than l_w. Moreover, we find that the classification of a symmetric adversarial image reverts to the correct label. To classify an image when adversaries are unaware of the defense, we apply symmetry to the image and use the classification label of the symmetric image. To classify an image when adversaries are aware of the defense, we use mirror symmetry and pixel inversion symmetry to form a symmetry group. We apply all the group symmetries to the image and decide on the output label based on the agreement of any two of the classification labels of the symmetric images. Adaptive attacks fail because they need to rely on loss functions that use conflicting CNN output values for symmetric images. Without attack knowledge, the proposed symmetry defense succeeds against both gradient-based and random-search attacks, with up to near-default accuracies for ImageNet. The defense even improves the classification accuracy of original images.
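The group-based classification rule described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. The helper names (`mirror`, `invert`, `symmetry_group`, `defended_label`), the grayscale nested-list image format with pixel values in [0, 1], and the choice to return `None` when no two labels agree are all assumptions made for illustration. The classifier is passed in as an arbitrary callable.

```python
from collections import Counter
from typing import Callable, List, Optional

# Assumption: a grayscale image as rows of pixel values in [0, 1].
Image = List[List[float]]


def mirror(img: Image) -> Image:
    """Mirror symmetry: flip each row horizontally."""
    return [row[::-1] for row in img]


def invert(img: Image) -> Image:
    """Pixel inversion symmetry: map each pixel p to 1 - p."""
    return [[1.0 - p for p in row] for row in img]


def symmetry_group(img: Image) -> List[Image]:
    """Identity, mirror, inversion, and their composition (four elements)."""
    return [img, mirror(img), invert(img), invert(mirror(img))]


def defended_label(classify: Callable[[Image], int], img: Image) -> Optional[int]:
    """Classify every symmetric image and return a label that at least two agree on.

    Returning None when no two labels agree is an assumption of this sketch;
    the abstract does not say how that case is handled.
    """
    labels = [classify(s) for s in symmetry_group(img)]
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= 2 else None
```

For example, if a hypothetical classifier outputs a wrong label only on the unmodified adversarial image and the correct label on its symmetric versions, `defended_label` returns the correct label because three of the four labels agree.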

updated: Thu Aug 10 2023 12:42:06 GMT+0000 (UTC)

published: Sat Oct 08 2022 18:49:58 GMT+0000 (UTC)

arXiv
