Increasing-Margin Adversarial (IMA) Training to Improve Adversarial Robustness of Neural Networks

Linhai Ma; Liang Liang

Deep neural networks (DNNs) are vulnerable to adversarial noise. By adding adversarial noise to training samples, adversarial training can improve a model's robustness against such noise. However, adversarial training samples with excessive noise can harm standard accuracy, which may be unacceptable for many medical image analysis applications. This issue has been termed the trade-off between standard accuracy and adversarial robustness. In this paper, we hypothesize that this issue may be alleviated if the adversarial samples used for training are placed right on the decision boundaries. Based on this hypothesis, we design an adaptive adversarial training method named IMA. For each training sample, IMA estimates a sample-wise upper bound on the adversarial perturbation. During training, each sample-wise adversarial perturbation is gradually increased until it matches the sample's margin. Once an equilibrium state is reached, the adversarial perturbations stop increasing. IMA is evaluated on publicly available datasets under two popular adversarial attacks, PGD and IFGSM. The results show that: (1) IMA significantly improves the adversarial robustness of DNN classifiers, achieving state-of-the-art performance; (2) IMA has the smallest reduction in clean accuracy among all competing defense methods; (3) IMA can be applied to pretrained models to reduce time cost; (4) IMA can be applied to state-of-the-art medical image segmentation networks, with outstanding performance. We hope our work may help to lift the trade-off between adversarial robustness and clean accuracy and facilitate the development of robust applications in the medical field. The source code will be released when this paper is published.
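
The abstract does not state the update rule, so the following is only a minimal sketch of the general idea of growing a per-sample perturbation budget until it matches the sample's margin. It assumes a fixed linear binary classifier, for which the L2 margin has a closed form. The function names, the `step` size, and the `eps_max` cap are all hypothetical and are not taken from the paper.

```python
import math

def l2_margin(w, b, x, y):
    """Signed L2 distance from x to the boundary w.x + b = 0.

    Positive when the sample (label y in {-1, +1}) is correctly classified.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def update_eps(eps, margin, step, eps_max):
    """Assumed rule: grow the budget by `step`, never past the margin or eps_max."""
    if margin <= 0:  # already misclassified: no room for a perturbation
        return 0.0
    return min(eps + step, margin, eps_max)

def adversarial_point(w, x, y, eps):
    """Move x by eps toward the boundary (worst-case L2 direction for a linear model)."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [xi - y * eps * wi / norm for xi, wi in zip(x, w)]

# Toy setup (illustrative values only): a fixed classifier and two samples.
w, b = [3.0, 4.0], 0.0
samples = [([1.0, 1.0], 1), ([-0.2, -0.1], -1)]
eps = [0.0 for _ in samples]

for _ in range(10):  # stand-in for training epochs; the model is not updated here
    for i, (x, y) in enumerate(samples):
        eps[i] = update_eps(eps[i], l2_margin(w, b, x, y), step=0.25, eps_max=2.0)

# At equilibrium each budget equals its sample's margin, so the
# perturbed sample sits on the decision boundary.
x_adv = adversarial_point(w, samples[0][0], samples[0][1], eps[0])
boundary_value = sum(wi * xi for wi, xi in zip(w, x_adv)) + b
```

In this toy setting the budgets stop growing once they reach each sample's own margin, so a sample close to the boundary receives a small perturbation and a sample far from it receives a larger one. For a real DNN the margin has no closed form and would have to be estimated, for example from the perturbation size at which an attack first succeeds; that step is not shown here.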

updated: Sun Aug 21 2022 17:53:33 GMT+0000 (UTC)

published: Tue May 19 2020 00:26:52 GMT+0000 (UTC)

arXiv
