Feature Perturbation Augmentation for Reliable Evaluation of Importance Estimators

Lennart Brocki; Neo Christopher Chung

Post-hoc explanation methods attempt to make the inner workings of deep neural networks more interpretable. However, because ground truth is generally lacking, local post-hoc interpretability methods, which assign importance scores to input features, are challenging to evaluate. One of the most popular evaluation frameworks is to perturb the features deemed important by an interpretability method and to measure the change in prediction accuracy. Intuitively, a large decrease in prediction accuracy would indicate that the explanation has correctly quantified the importance of features with respect to the prediction outcome (e.g., logits). However, the change in the prediction outcome may stem from perturbation artifacts, since perturbed samples in the test dataset are out of distribution (OOD) compared to the training dataset and can therefore disturb the model in unexpected ways. To overcome this challenge, we propose feature perturbation augmentation (FPA), which creates and adds perturbed images during model training. Through extensive computational experiments, we demonstrate that FPA makes deep neural networks (DNNs) more robust against perturbations. Furthermore, training DNNs with FPA demonstrates that the sign of importance scores may explain the model more meaningfully than has previously been assumed. Overall, FPA is an intuitive data augmentation technique that improves the evaluation of post-hoc interpretability methods.
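The two ideas in the abstract, perturbing the features an explanation ranks as important and adding perturbed images during training, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the perturbation type, the fraction of perturbed pixels, or how pixels are selected, so the constant fill value, the random pixel selection, and the names `perturb_most_important` and `fpa_augment` are all assumptions made for illustration.

```python
import numpy as np


def perturb_most_important(image, importance, fraction, fill_value=0.0):
    """Evaluation-style perturbation (assumed form): overwrite the top
    `fraction` of pixels, ranked by importance score, with `fill_value`."""
    k = int(round(fraction * importance.size))
    perturbed = image.copy()
    if k == 0:
        return perturbed
    # Flat indices of the k highest-scoring pixels, most important first.
    top_idx = np.argsort(importance.ravel())[::-1][:k]
    perturbed.flat[top_idx] = fill_value
    return perturbed


def fpa_augment(image, rng, max_fraction=0.5, fill_value=0.0):
    """Training-style augmentation (assumed form): overwrite a random subset
    of pixels so that perturbed inputs are no longer out of distribution."""
    fraction = rng.uniform(0.0, max_fraction)
    k = int(round(fraction * image.size))
    perturbed = image.copy()
    # Random pixel positions, drawn without replacement.
    idx = rng.choice(image.size, size=k, replace=False)
    perturbed.flat[idx] = fill_value
    return perturbed


# Toy example: a 4x4 single-channel image with made-up importance scores.
rng = np.random.default_rng(0)
image = np.ones((4, 4))
importance = np.arange(16, dtype=float).reshape(4, 4)

masked = perturb_most_important(image, importance, fraction=0.25)
augmented = fpa_augment(image, rng)
```

In a real pipeline, `fpa_augment` would be applied to training batches alongside the unperturbed images, and `perturb_most_important` would be applied to test images before measuring the change in prediction accuracy. Both functions return copies and leave the input image unchanged.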

updated: Thu Mar 02 2023 19:05:46 GMT+0000 (UTC)

published: Thu Mar 02 2023 19:05:46 GMT+0000 (UTC)

arXiv
