Efficient Certified Defenses Against Patch Attacks on Image Classifiers

Jan Hendrik Metzen; Maksym Yatsura

Adversarial patches pose a realistic threat model for physical-world attacks on autonomous systems via their perception component. Autonomous systems in safety-critical domains such as automated driving should thus contain a fail-safe fallback component that combines certifiable robustness against patches with efficient inference while maintaining high performance on clean inputs. We propose BagCert, a novel combination of model architecture and certification procedure that allows efficient certification. We derive a loss that enables end-to-end optimization of certified robustness against patches of different sizes and locations. On CIFAR10, BagCert certifies 10,000 examples in 43 seconds on a single GPU and obtains 86% clean accuracy and 60% certified accuracy against 5x5 patches.
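
To put the reported numbers in perspective, the following is a minimal back-of-the-envelope sketch, not the BagCert certification procedure itself. It assumes CIFAR10's standard 32x32 image resolution and a square patch that lies fully inside the image at any pixel offset; the function names `patch_positions` and `throughput` are hypothetical helpers introduced only for this illustration.

```python
def patch_positions(image_size: int, patch_size: int) -> int:
    """Count top-left offsets at which a square patch fits fully inside a square image.

    Assumption: the patch may sit at any pixel offset (stride 1) and must not
    extend beyond the image border.
    """
    if patch_size > image_size:
        return 0
    side = image_size - patch_size + 1
    return side * side


def throughput(num_examples: int, seconds: float) -> float:
    """Average number of examples processed per second."""
    return num_examples / seconds


# Assumed CIFAR10 resolution (32x32) with the 5x5 patch size from the abstract.
positions = patch_positions(32, 5)

# Certification figures quoted in the abstract: 10,000 examples in 43 seconds.
rate = throughput(10_000, 43.0)

print(f"patch locations per image: {positions}")
print(f"certified examples per second: {rate:.1f}")
```

Under these assumptions, a certificate for a single image has to cover several hundred possible patch locations, which suggests why a procedure that avoids checking each location with a separate forward pass matters for the reported certification speed.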

updated: Mon Feb 08 2021 12:11:41 GMT+0000 (UTC)

published: Mon Feb 08 2021 12:11:41 GMT+0000 (UTC)

arXiv
