This paper is concerned with the defense of deep models against adversarial attacks. Inspired by the certificate defense approach, we propose a maximal adversarial distortion (MAD) optimization method for robustifying deep networks. MAD captures the idea of increasing separability of class clusters in the embedding space while decreasing the network sensitivity to small distortions. Given a deep neural network (DNN) for a classification problem, an application of MAD optimization results in MadNet, a version of the original network, now equipped with an adversarial defense mechanism. MAD optimization is intuitive, effective and scalable, and the resulting MadNet can improve the original accuracy. We present an extensive empirical study demonstrating that MadNet improves adversarial robustness performance compared to state-of-the-art methods.
updated: Fri Jun 12 2020 19:05:36 GMT+0000 (UTC)
published: Sun Nov 03 2019 11:21:35 GMT+0000 (UTC)