Attribution of Predictive Uncertainties in Classification Models

Iker Perez; Piotr Skalski; Alec Barns-Graham; Jason Wong; David Sutton

Predictive uncertainties in classification tasks are often a consequence of model inadequacy or insufficient training data. In popular applications, such as image processing, we are often required to scrutinise these uncertainties by meaningfully attributing them to input features, which helps to improve interpretability assessments. However, few effective frameworks exist for this purpose. Vanilla forms of popular methods for producing saliency masks, such as SHAP or integrated gradients, adapt poorly when the target is a measure of uncertainty. State-of-the-art tools therefore proceed by creating counterfactual or adversarial feature vectors, and assign attributions by direct comparison with the original images. In this paper, we present a novel framework that combines path integrals, counterfactual explanations and generative models to obtain attributions that contain few observable artefacts and little noise. Quantitative evaluations with popular benchmarking methods and data sets of varying complexity provide evidence that it outperforms existing alternatives.
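To make the idea of attributing an uncertainty measure to input features concrete, here is a minimal sketch of vanilla integrated gradients applied to predictive entropy. It is not the framework proposed in the paper, which additionally relies on counterfactual explanations and generative models. It only illustrates the kind of baseline approach the abstract mentions. The toy linear softmax classifier, the all-zeros baseline, the choice of entropy as the uncertainty measure, and every function and variable name below are assumptions made for illustration.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax over a 1-D vector of logits.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def entropy(p, eps=1e-12):
    # Predictive entropy H(p) = -sum_k p_k log p_k (assumed uncertainty measure).
    return float(-(p * np.log(p + eps)).sum())


def entropy_grad(x, W, b, eps=1e-12):
    # Gradient of the entropy with respect to the input x for a
    # hypothetical linear softmax classifier with logits z = W x + b.
    # Uses dH/dz_j = -p_j * (log p_j + H), then the chain rule through W.
    p = softmax(W @ x + b)
    h = -(p * np.log(p + eps)).sum()
    dz = -p * (np.log(p + eps) + h)
    return W.T @ dz


def integrated_gradients_entropy(x, baseline, W, b, steps=200):
    # Midpoint Riemann approximation of the path integral of the gradient
    # along the straight line from `baseline` to `x`.
    alphas = (np.arange(steps) + 0.5) / steps
    avg_grad = np.zeros_like(x)
    for a in alphas:
        avg_grad += entropy_grad(baseline + a * (x - baseline), W, b)
    avg_grad /= steps
    return (x - baseline) * avg_grad


# Toy example with random weights (purely illustrative data).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
b = rng.normal(size=3)
x = rng.normal(size=5)
baseline = np.zeros(5)  # assumed reference input

attr = integrated_gradients_entropy(x, baseline, W, b)
gap = entropy(softmax(W @ x + b)) - entropy(softmax(W @ baseline + b))
print("attributions:", attr)
print("sum of attributions:", attr.sum(), "entropy difference:", gap)
```

Up to discretisation error, the attributions sum to the difference in entropy between the input and the baseline, which is the completeness property of integrated gradients. The result depends heavily on the chosen baseline, which is one reason a fixed reference input can be a poor fit when the target is an uncertainty measure.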

updated: Wed Jun 08 2022 16:10:21 GMT+0000 (UTC)

published: Mon Jul 19 2021 11:07:34 GMT+0000 (UTC)

arXiv
