Increasing-Margin Adversarial (IMA) Training to Improve Adversarial Robustness of Neural Networks

Linhai Ma; Liang Liang

Deep neural networks (DNNs) are vulnerable to adversarial noise. By adding adversarial noise to training samples, adversarial training can improve a model's robustness against such noise. However, adversarial training samples with excessive noise can harm standard accuracy, which may be unacceptable for many medical image analysis applications. This issue has been termed the trade-off between standard accuracy and adversarial robustness. In this paper, we hypothesize that this issue may be alleviated if the adversarial samples used for training are placed right on the decision boundaries. Based on this hypothesis, we design an adaptive adversarial training method, named IMA. For each individual training sample, IMA makes a sample-wise estimate of the upper bound of the adversarial perturbation. During training, each sample-wise adversarial perturbation is gradually increased to match the margin, that is, the sample's distance to the decision boundary. Once an equilibrium state is reached, the adversarial perturbations stop increasing. IMA is evaluated on publicly available datasets under two popular adversarial attacks, PGD and IFGSM. The results show that: (1) IMA significantly improves the adversarial robustness of DNN classifiers, achieving state-of-the-art performance; (2) IMA incurs the smallest reduction in clean accuracy among all competing defense methods; (3) IMA can be applied to pretrained models to reduce time cost; (4) IMA can be applied to state-of-the-art medical image segmentation networks, with outstanding performance. We hope our work may help to lift the trade-off between adversarial robustness and clean accuracy and facilitate the development of robust applications in the medical field. The source code will be released when this paper is published.
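
The abstract describes the per-sample perturbation budget only in words, so the following is a minimal toy sketch of that idea and not the authors' algorithm. It assumes each sample's margin is already known (in real training it would have to be estimated, for example with an attack), and the names `update_budgets`, `run_schedule`, `step`, and `eps_max` are hypothetical. Each budget grows by a fixed step per epoch until it meets the sample's margin or a global cap, and the loop stops once no budget changes, which stands in for the equilibrium state mentioned above.

```python
def update_budgets(eps, margins, step, eps_max):
    """One epoch of a hypothetical per-sample budget update.

    eps[i]     -- current perturbation budget for sample i
    margins[i] -- distance from sample i to the decision boundary
                  (assumed known here; a real method must estimate it)
    """
    new_eps = []
    for e, m in zip(eps, margins):
        if e < m:
            # The budget does not reach the boundary yet: grow it,
            # but never past the margin or the global cap.
            new_eps.append(min(e + step, m, eps_max))
        else:
            # The budget already reaches the boundary: hold it at the margin.
            new_eps.append(min(m, eps_max))
    return new_eps


def run_schedule(margins, step=0.1, eps_max=1.0, epochs=50):
    """Grow all budgets from zero until nothing changes (toy equilibrium)."""
    eps = [0.0] * len(margins)
    for _ in range(epochs):
        updated = update_budgets(eps, margins, step, eps_max)
        if updated == eps:
            break  # no budget moved in this epoch
        eps = updated
    return eps


if __name__ == "__main__":
    # Three samples: two close to the boundary, one far from it.
    print(run_schedule([0.25, 0.5, 2.0]))
```

In this toy setting, samples near the boundary end up with small budgets while distant samples are limited only by the cap, which mirrors the intuition that training perturbations should not push a sample past the decision boundary.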

updated: Wed May 18 2022 09:58:32 GMT+0000 (UTC)

published: Tue May 19 2020 00:26:52 GMT+0000 (UTC)

arXiv
