Improving the Transferability of Adversarial Attacks on Face Recognition with Beneficial Perturbation Feature Augmentation

Fengfan Zhou; Hefei Ling; Yuxuan Shi; Jiazhong Chen; Zongyi Li; Ping Li

Face recognition (FR) models can be easily fooled by adversarial examples, which are crafted by adding imperceptible perturbations to benign face images. The existence of adversarial face examples poses a serious threat to the security of society. To help build a more sustainable digital nation, this paper improves the transferability of adversarial face examples in order to expose more blind spots of existing FR models. Although generating hard samples has proven effective at improving model generalization in training tasks, whether this idea can improve the transferability of adversarial face examples remains unexplored. To this end, based on the properties of hard samples and the symmetry between training tasks and adversarial attack tasks, we propose the concept of hard models, which play a role in adversarial attack tasks similar to that of hard samples in training tasks. Building on this concept, we propose a novel attack method called Beneficial Perturbation Feature Augmentation Attack (BPFA), which reduces the overfitting of adversarial examples to surrogate FR models by constantly generating new hard models with which to craft the adversarial examples. Specifically, during backpropagation, BPFA records the gradients on pre-selected feature maps and uses the gradient on the input image to craft the adversarial example. In the next forward propagation, BPFA uses the recorded gradients to add beneficial perturbations to the corresponding feature maps so as to increase the loss. Extensive experiments demonstrate that BPFA can significantly boost the transferability of adversarial attacks on FR.
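The record-then-perturb loop described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The two-layer toy "surrogate" network, the squared-distance impersonation loss, the sign-based updates, the step sizes, and every name in the code (`forward`, `loss_and_grads`, `bpfa_sketch`, `feat_step`) are assumptions made for illustration. The sketch also assumes a sign convention in which the adversarial example is updated to lower the loss while the feature-map perturbation raises it.

```python
import numpy as np

# Toy stand-in for a surrogate FR model (hypothetical, not from the paper):
# embedding = W2 @ (tanh(W1 @ x) + feat_pert)
rng = np.random.default_rng(0)
W1 = rng.normal(size=(8, 16)) * 0.3
W2 = rng.normal(size=(4, 8)) * 0.3


def forward(x, feat_pert):
    """Return the clean feature map and the embedding of the perturbed model."""
    act = np.tanh(W1 @ x)          # the pre-selected feature map
    emb = W2 @ (act + feat_pert)   # perturbation is added on that feature map
    return act, emb


def loss_and_grads(x, feat_pert, target_emb):
    """Assumed impersonation loss: 0.5 * ||emb - target_emb||^2."""
    act, emb = forward(x, feat_pert)
    diff = emb - target_emb
    loss = 0.5 * float(diff @ diff)
    grad_feat = W2.T @ diff                          # gradient on the feature map
    grad_x = W1.T @ (grad_feat * (1.0 - act ** 2))   # gradient on the input
    return loss, grad_feat, grad_x


def bpfa_sketch(x_src, x_tgt, eps=0.1, step=0.01, feat_step=0.05, iters=20):
    """Iterative attack with a feature perturbation built from recorded gradients."""
    _, target_emb = forward(x_tgt, np.zeros(W1.shape[0]))
    x_adv = x_src.copy()
    recorded = np.zeros(W1.shape[0])  # nothing recorded before the first backward pass
    for _ in range(iters):
        # Forward pass: reuse the gradient recorded in the previous backward
        # pass to push the feature map in the loss-increasing direction.
        feat_pert = feat_step * np.sign(recorded)
        _, grad_feat, grad_x = loss_and_grads(x_adv, feat_pert, target_emb)
        # Backward pass: record the feature-map gradient for the next
        # iteration and use the input gradient to update the example.
        recorded = grad_feat
        x_adv = x_adv - step * np.sign(grad_x)
        # Keep the perturbation inside an L-infinity ball around the source.
        x_adv = np.clip(x_adv, x_src - eps, x_src + eps)
    return x_adv


if __name__ == "__main__":
    src = np.linspace(-1.0, 1.0, 16)
    tgt = np.cos(np.arange(16.0))
    adv = bpfa_sketch(src, tgt)
    print("max |adv - src| =", np.max(np.abs(adv - src)))
```

In a real attack the surrogate would be a deep FR network and the gradients would come from automatic differentiation. The toy model uses hand-written gradients only so that the sketch stays self-contained.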

updated: Wed Mar 29 2023 15:35:55 GMT+0000 (UTC)

published: Fri Oct 28 2022 13:25:59 GMT+0000 (UTC)

arXiv
