Semantic Image Attack for Visual Model Diagnosis

Jinqi Luo; Zhaoning Wang; Chen Henry Wu; Dong Huang; Fernando De la Torre


In practice, metric analysis on a specific train and test dataset does not guarantee reliable or fair ML models. This is partly because obtaining a balanced, diverse, and perfectly labeled dataset is typically expensive, time-consuming, and error-prone. Rather than relying on a carefully designed test set to assess ML models' failures, fairness, or robustness, this paper proposes Semantic Image Attack (SIA), a method based on adversarial attacks that produces semantic adversarial images to support model diagnosis, interpretability, and robustness. Traditional adversarial training is a popular methodology for robustifying ML models against attacks. However, existing adversarial methods do not combine the two aspects that enable interpretation and analysis of a model's flaws: semantic traceability and perceptual quality. SIA combines the two via iterative gradient ascent on a predefined semantic attribute space and the image space. We illustrate the validity of our approach in three scenarios for keypoint detection and classification. (1) Model diagnosis: SIA generates a histogram of attributes that highlights the semantic vulnerability of the ML model (i.e., the attributes that make the model fail). (2) Stronger attacks: SIA generates adversarial examples with visually interpretable attributes that lead to higher attack success rates than baseline methods. Adversarial training with SIA improves robustness that transfers across different gradient-based attacks. (3) Robustness to imbalanced datasets: we use SIA to augment the underrepresented classes, which outperforms strong augmentation and re-balancing baselines.
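
The idea of iterative gradient ascent over both an attribute space and the image space, followed by a histogram of the attributes that moved most, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the paper's implementation: the generator is a random linear map (`W`), the victim model is a linear logistic classifier (`w_clf`), the attribute names are hypothetical, and "most-shifted attribute" is a simplified stand-in for however the paper builds its histogram.

```python
import math
import random
from collections import Counter

ATTRIBUTE_NAMES = ["attr_a", "attr_b", "attr_c"]  # hypothetical attribute labels


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def clip(value, low, high):
    return max(low, min(high, value))


def generate(W, attrs, delta):
    """Toy linear 'generator': image = W @ attrs + delta (an assumption, not the paper's model)."""
    return [dot(row, attrs) + d for row, d in zip(W, delta)]


def loss_and_grad(w_clf, x, y):
    """Logistic loss of a linear classifier for label y in {-1, +1}, and its gradient w.r.t. x."""
    margin = y * dot(w_clf, x)
    loss = math.log1p(math.exp(-margin))
    coeff = -y / (1.0 + math.exp(margin))
    return loss, [coeff * wi for wi in w_clf]


def semantic_attack(W, w_clf, attrs, y, steps=30, lr_attr=0.1, lr_pix=0.01, eps=0.05):
    """Gradient ascent on the loss over the attribute vector and a bounded pixel perturbation."""
    attrs = list(attrs)
    delta = [0.0] * len(W)
    for _ in range(steps):
        _, gx = loss_and_grad(w_clf, generate(W, attrs, delta), y)
        # Chain rule through the linear generator: dL/dattrs = W^T gx.
        ga = [sum(gx[i] * W[i][j] for i in range(len(W))) for j in range(len(attrs))]
        # Ascent step on attributes, kept inside the box [-1, 1].
        attrs = [clip(a + lr_attr * g, -1.0, 1.0) for a, g in zip(attrs, ga)]
        # Signed ascent step on pixels, clipped to the range [-eps, eps].
        delta = [clip(d + lr_pix * math.copysign(1.0, g), -eps, eps) for d, g in zip(delta, gx)]
    return attrs, delta


rng = random.Random(0)
n_pixels, n_attrs, n_samples = 8, len(ATTRIBUTE_NAMES), 20
W = [[rng.uniform(-1, 1) for _ in range(n_attrs)] for _ in range(n_pixels)]
w_clf = [rng.uniform(-1, 1) for _ in range(n_pixels)]

histogram = Counter()
clean_losses, adv_losses = [], []
for _ in range(n_samples):
    attrs = [rng.uniform(-0.5, 0.5) for _ in range(n_attrs)]
    zero = [0.0] * n_pixels
    # Use the model's own prediction on the clean image as the label to attack.
    y = 1 if dot(w_clf, generate(W, attrs, zero)) >= 0 else -1
    adv_attrs, delta = semantic_attack(W, w_clf, attrs, y)
    clean_losses.append(loss_and_grad(w_clf, generate(W, attrs, zero), y)[0])
    adv_losses.append(loss_and_grad(w_clf, generate(W, adv_attrs, delta), y)[0])
    # Credit the attribute that moved most during the attack.
    shifts = [abs(b - a) for a, b in zip(attrs, adv_attrs)]
    histogram[ATTRIBUTE_NAMES[shifts.index(max(shifts))]] += 1

print(dict(histogram))
```

In this toy setting the loss is convex in the attack variables, so each ascent step cannot decrease it. A real generator and classifier are nonlinear, so that guarantee would not carry over, and gradients would typically come from automatic differentiation rather than the hand-written chain rule used here.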

updated: Thu Mar 23 2023 03:13:04 GMT+0000 (UTC)

published: Thu Mar 23 2023 03:13:04 GMT+0000 (UTC)

arXiv
