Human Imperceptible Attacks and Applications to Improve Fairness

Xinru Hua; Huanzhong Xu; Jose Blanchet; Viet Nguyen

Modern neural networks perform at least as well as humans on numerous tasks involving object classification and image generation. However, small perturbations that are imperceptible to humans may significantly degrade the performance of well-trained deep neural networks. We provide a Distributionally Robust Optimization (DRO) framework that integrates human-based image quality assessment methods to design optimal attacks that are imperceptible to humans yet significantly damaging to deep neural networks. Through extensive experiments, we show that our attack algorithm generates better-quality (less perceptible to humans) attacks than other state-of-the-art human-imperceptible attack methods. Moreover, we demonstrate that DRO training using our optimally designed human-imperceptible attacks can improve group fairness in image classification. Finally, we provide an algorithmic implementation that significantly speeds up DRO training, which could be of independent interest.

updated: Tue Nov 30 2021 17:54:13 GMT+0000 (UTC)

published: Tue Nov 30 2021 17:54:13 GMT+0000 (UTC)

arXiv
