Robustifying Deep Vision Models Through Shape Sensitization

Aditay Tripathi; Rishubh Singh; Anirban Chakraborty; Pradeep Shenoy

Recent work has shown that deep vision models tend to be overly dependent on low-level or "texture" features, leading to poor generalization. Various data augmentation strategies have been proposed to overcome this so-called texture bias in DNNs. We propose a simple, lightweight adversarial augmentation technique that explicitly incentivizes the network to learn holistic shapes for accurate prediction in an object classification setting. Our augmentations superpose the edgemap of one image onto another image whose patches have been shuffled, using a randomly determined mixing proportion, and assign the result the label of the edgemap image. To classify these augmented images, the model needs not only to detect and focus on edges but also to distinguish between relevant and spurious edges. We show that our augmentations significantly improve classification accuracy and robustness measures on a range of datasets and neural architectures. As an example, for ViT-S, we obtain absolute gains in classification accuracy of up to 6%. We also obtain gains of up to 28% and 8.5% on natural adversarial and out-of-distribution datasets like ImageNet-A (for ViT-B) and ImageNet-R (for ViT-S), respectively. Analysis using a range of probe datasets shows substantially increased shape sensitivity in our trained models, explaining the observed improvement in robustness and classification accuracy.
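
The augmentation described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the edge detector, the patch grid, or how the mixing proportion is sampled, so the gradient-magnitude edge map, the `grid` parameter, the uniform sampling of `lam`, and all function names below are assumptions made purely for illustration. Images are assumed to be float arrays of shape (H, W, 3) with values in [0, 1] and matching sizes.

```python
import numpy as np

def edge_map(img):
    """Gradient-magnitude edge map scaled to [0, 1].
    Assumption: a stand-in for whatever edge detector the paper uses."""
    gray = img.mean(axis=2)
    gy, gx = np.gradient(gray)
    mag = np.hypot(gx, gy)
    peak = mag.max()
    return mag / peak if peak > 0 else mag

def shuffle_patches(img, grid, rng):
    """Cut img into grid x grid patches and randomly permute them."""
    h, w, c = img.shape
    ph, pw = h // grid, w // grid
    img = img[:ph * grid, :pw * grid]  # crop so the grid divides evenly
    patches = (img.reshape(grid, ph, grid, pw, c)
                  .swapaxes(1, 2)
                  .reshape(grid * grid, ph, pw, c))
    patches = patches[rng.permutation(grid * grid)]
    return (patches.reshape(grid, grid, ph, pw, c)
                   .swapaxes(1, 2)
                   .reshape(ph * grid, pw * grid, c))

def shape_augment(edge_src, edge_label, other, grid=4, rng=None):
    """Superpose the edge map of edge_src onto a patch-shuffled `other`.
    The label of the edge-map image is kept, as the abstract states.
    Assumption: lam ~ Uniform(0, 1) weights the edge map."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.uniform(0.0, 1.0)
    shuffled = shuffle_patches(other, grid, rng)
    # Crop the edge map to the shuffled image and add a channel axis.
    edges = edge_map(edge_src)[:shuffled.shape[0], :shuffled.shape[1], None]
    mixed = lam * edges + (1.0 - lam) * shuffled
    return mixed, edge_label, lam
```

Shuffling the patches of the second image destroys its global shape while keeping its local texture, so the only coherent shape cue left in the mixed image comes from the edge map, which is the image whose label is used.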

updated: Mon Nov 14 2022 11:17:46 GMT+0000 (UTC)

published: Mon Nov 14 2022 11:17:46 GMT+0000 (UTC)

arXiv
