Evolving Architectures with Gradient Misalignment toward Low Adversarial Transferability

Kevin Richard G. Operiano; Wanchalerm Pora; Hitoshi Iba; Hiroshi Kera

Deep neural network image classifiers are known to be susceptible not only to adversarial examples crafted against them but also to those crafted against other networks. This phenomenon, known as transferability, poses a potential security risk for black-box systems that rely on image classifiers. The reason adversarial examples transfer is not yet fully understood, and many studies have proposed training methods to obtain classifiers with low transferability. In this study, we address the problem from a novel perspective by investigating the contribution of the network architecture to transferability. Specifically, we propose an architecture search framework that employs neuroevolution to evolve network architectures and a gradient misalignment loss to encourage networks to converge to dissimilar functions after training. Our experiments show that the proposed framework successfully discovers architectures that reduce transferability from four standard networks, including ResNet and VGG, while maintaining good accuracy on unperturbed images. In addition, the evolved networks trained with gradient misalignment exhibit significantly lower transferability than standard networks trained with gradient misalignment, which indicates that the network architecture plays an important role in reducing transferability. This study demonstrates that designing or exploring proper network architectures is a promising approach to tackling the transferability issue and training adversarially robust image classifiers.
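The abstract does not define the gradient misalignment loss, so the following is only a minimal sketch under a common assumption: that misalignment is encouraged by penalizing the cosine similarity between the input gradients of two classifiers' losses. The linear softmax classifiers, the helper names (`input_gradient`, `gradient_alignment_penalty`), and the toy weights are all hypothetical and chosen so the gradients can be computed by hand; none of them come from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def input_gradient(W, b, x, label):
    """Gradient of the cross-entropy loss w.r.t. the input x.

    Assumes a linear softmax classifier (logits = W x + b), for which the
    gradient is W^T (softmax(logits) - one_hot(label)).
    """
    logits = [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]
    probs = softmax(logits)
    delta = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    return [sum(W[k][j] * delta[k] for k in range(len(W))) for j in range(len(x))]


def cosine_similarity(u, v, eps=1e-12):
    """Cosine similarity between two vectors, with eps guarding against zero norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return dot / (norm_u * norm_v + eps)


def gradient_alignment_penalty(model_a, model_b, x, label):
    """Hypothetical penalty: cosine similarity of the two models' input gradients.

    Adding this term to a training loss and minimizing it would push the
    gradients of the two models apart (assumed form, not the paper's definition).
    """
    g_a = input_gradient(*model_a, x, label)
    g_b = input_gradient(*model_b, x, label)
    return cosine_similarity(g_a, g_b)


# Toy two-class models (W, b); model_b is sensitive to a different input feature.
model_a = ([[1.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
model_b = ([[0.0, 1.0], [0.0, 0.0]], [0.0, 0.0])
x, label = [0.5, -0.3], 0

same = gradient_alignment_penalty(model_a, model_a, x, label)
different = gradient_alignment_penalty(model_a, model_b, x, label)
print(f"aligned: {same:.3f}, misaligned: {different:.3f}")
```

In this toy setting, a model compared with itself has fully aligned gradients, while the two models that depend on different input features have orthogonal gradients. A perturbation that follows one model's gradient then has no first-order effect on the other model's loss, which is the intuition for why misaligned gradients might lower transferability.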

updated: Mon Sep 13 2021 12:41:53 GMT+0000 (UTC)

published: Mon Sep 13 2021 12:41:53 GMT+0000 (UTC)

arXiv
