Adversarially Robust Distillation

Micah Goldblum; Liam Fowl; Soheil Feizi; Tom Goldstein

Knowledge distillation is effective for producing small, high-performance neural networks for classification, but these small networks are vulnerable to adversarial attacks. This paper studies how adversarial robustness transfers from teacher to student during knowledge distillation. First, we find that the student may inherit a large amount of robustness even when distilled on clean images only. Second, we introduce Adversarially Robust Distillation (ARD) for distilling robustness onto student networks. In addition to producing small models with high test accuracy, as conventional distillation does, ARD passes the superior robustness of large networks on to the student. In our experiments, ARD student models decisively outperform adversarially trained networks of identical architecture in terms of robust accuracy, surpassing state-of-the-art methods on standard robustness benchmarks. Finally, we adapt recent fast adversarial training methods to ARD for accelerated robust distillation.
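The abstract does not state the ARD objective, so the following is only a minimal sketch of one plausible form of a robust distillation loss, not the authors' definition. It assumes that the student's prediction on an adversarially perturbed input is pulled toward the teacher's temperature-softened prediction on the clean input, with an added cross-entropy term on the clean input. The function names, the `temperature` and `alpha` parameters, and their default values are all hypothetical, and the adversarial perturbation itself is assumed to be computed elsewhere.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def robust_distillation_loss(student_logits_adv, student_logits_clean,
                             teacher_logits_clean, label,
                             temperature=4.0, alpha=0.5):
    """Hypothetical robust distillation loss (an assumption, see lead-in).

    student_logits_adv:   student outputs on the perturbed input
    student_logits_clean: student outputs on the clean input
    teacher_logits_clean: teacher outputs on the clean input
    label:                index of the true class
    """
    teacher_soft = softmax(teacher_logits_clean, temperature)
    student_soft = softmax(student_logits_adv, temperature)
    # Distillation term: student on the perturbed input vs. teacher on the
    # clean input, scaled by temperature squared as in standard distillation.
    distill = temperature ** 2 * kl_divergence(teacher_soft, student_soft)
    # Supervised term: cross-entropy of the student on the clean input.
    cross_entropy = -math.log(softmax(student_logits_clean)[label])
    return (1 - alpha) * distill + alpha * cross_entropy
```

Under these assumptions, the distillation term vanishes when the student's outputs on the perturbed input match the teacher's outputs on the clean input, which is one way to read "distilling robustness onto student networks".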

updated: Mon Dec 02 2019 22:37:09 GMT+0000 (UTC)

published: Thu May 23 2019 16:09:20 GMT+0000 (UTC)

arXiv
