We augment adversarial training (AT) with worst-case adversarial training (WCAT), which improves adversarial robustness by 11% over the current state-of-the-art result in the ℓ_2 norm on CIFAR-10. We obtain verifiable average-case and worst-case robustness guarantees based on the expected and maximum values of the norm of the gradient of the loss. We interpret adversarial training as total variation regularization, a fundamental tool in mathematical image processing, and WCAT as Lipschitz regularization.
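The two quantities named above, the expected and the maximum norm of the gradient of the loss with respect to the input, can be illustrated with a minimal sketch. The abstract does not state the exact training objective, so everything specific here is an assumption made for illustration: the logistic model, the additive penalty with weight `lam`, and the function names `input_gradient_norms` and `regularized_losses`. The mean of the gradient norms plays the role of a total-variation-style term (average case), and the maximum plays the role of a Lipschitz-style term (worst case).

```python
import numpy as np

def input_gradient_norms(w, b, X, y):
    """l2 norm of d(loss)/dx for each example (logistic loss, linear model)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
    grads = (p - y)[:, None] * w[None, :]    # analytic gradient w.r.t. each input
    return np.linalg.norm(grads, axis=1)     # one l2 norm per example

def regularized_losses(w, b, X, y, lam=0.1):
    """Return (plain loss, loss + mean-norm penalty, loss + max-norm penalty)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    p = np.clip(p, 1e-12, 1.0 - 1e-12)       # guard against log(0)
    ce = -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    norms = input_gradient_norms(w, b, X, y)
    tv_style = ce + lam * norms.mean()       # average-case (expected norm) penalty
    lip_style = ce + lam * norms.max()       # worst-case (maximum norm) penalty
    return ce, tv_style, lip_style

# Hypothetical toy data, used only to exercise the functions.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = (rng.random(8) > 0.5).astype(float)
w = rng.normal(size=3)
b = 0.2

ce, tv_style, lip_style = regularized_losses(w, b, X, y)
print(ce, tv_style, lip_style)
```

Because the maximum of the per-example norms is never smaller than their mean, the worst-case penalty is always at least as large as the average-case one for the same `lam`. In a deep network the input gradient would come from automatic differentiation rather than a closed form.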
updated: Fri Sep 13 2019 14:56:57 GMT+0000 (UTC)
published: Mon Oct 01 2018 20:02:00 GMT+0000 (UTC)