Explainers in the Wild: Making Surrogate Explainers Robust to Distortions through Perception

Alexander Hepburn; Raul Santos-Rodriguez

Explaining the decisions of models is becoming pervasive in the image processing domain, whether by using post-hoc methods or by creating inherently interpretable models. While the widespread use of surrogate explainers is a welcome addition for inspecting and understanding black-box models, assessing the robustness and reliability of the explanations is key to their success. Additionally, whilst existing work in the explainability field proposes various strategies to address this problem, the challenges of working with data in the wild are often overlooked. For instance, in image classification, distortions to images can affect not only the predictions assigned by the model, but also the explanation. Given a clean and a distorted version of an image, even if the prediction probabilities are similar, the explanation may still be different. In this paper we propose a methodology to evaluate the effect of distortions on explanations by embedding perceptual distances that tailor the neighbourhoods used to train surrogate explainers. We also show that by operating in this way, we can make the explanations more robust to distortions. We generate explanations for images in the Imagenet-C dataset and demonstrate how using perceptual distances in the surrogate explainer creates more coherent explanations for the distorted and reference images.
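To illustrate where a distance enters a surrogate explainer, the following is a minimal LIME-style sketch, not the authors' implementation. It assumes square grid segments instead of a real superpixel algorithm, a toy `predict` function in place of a black-box classifier, and a hypothetical `toy_perceptual` distance (RMSE after mean and contrast normalisation) standing in for a genuine perceptual metric. All function names and parameters are illustrative.

```python
import numpy as np

def grid_segments(h, w, cells):
    """Assign each pixel to one of cells*cells rectangular segments."""
    rows = np.arange(h) * cells // h
    cols = np.arange(w) * cells // w
    return rows[:, None] * cells + cols[None, :]

def euclidean(a, b):
    """Root-mean-square pixel difference."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

def toy_perceptual(a, b, eps=1e-6):
    """Hypothetical stand-in for a perceptual distance (assumption):
    RMSE after removing the mean and dividing by the standard deviation."""
    na = (a - a.mean()) / (a.std() + eps)
    nb = (b - b.mean()) / (b.std() + eps)
    return float(np.sqrt(np.mean((na - nb) ** 2)))

def explain(image, predict, distance, cells=4, n_samples=200,
            width=0.75, seed=0):
    """Fit a weighted linear surrogate over on/off segment masks.

    The distance function decides how strongly each perturbed sample
    counts, i.e. it shapes the neighbourhood the surrogate is trained on.
    """
    rng = np.random.default_rng(seed)
    seg = grid_segments(*image.shape, cells)
    n_seg = cells * cells
    masks = rng.integers(0, 2, size=(n_samples, n_seg))
    masks[0] = 1  # include the unperturbed image itself
    samples = np.stack([image * m[seg] for m in masks])
    y = np.array([predict(s) for s in samples])
    d = np.array([distance(image, s) for s in samples])
    d = d / (d.max() + 1e-12)               # rescale distances to [0, 1]
    weights = np.exp(-(d ** 2) / width ** 2)  # exponential kernel
    # Weighted ridge regression with an intercept column.
    X = np.hstack([masks, np.ones((n_samples, 1))])
    A = X.T @ (weights[:, None] * X) + 1e-6 * np.eye(n_seg + 1)
    coef = np.linalg.solve(A, X.T @ (weights * y))
    return coef[:n_seg]  # one importance score per segment

# Toy black box: the "class score" is the mean of the top-left quadrant.
image = np.random.default_rng(1).uniform(0.2, 1.0, size=(16, 16))
predict = lambda img: float(img[:8, :8].mean())

w_euc = explain(image, predict, euclidean)
w_per = explain(image, predict, toy_perceptual)
top_euc = sorted(int(i) for i in np.argsort(w_euc)[-4:])
top_per = sorted(int(i) for i in np.argsort(w_per)[-4:])
print(top_euc, top_per)
```

Because the toy model is exactly linear in the segment masks, both distances recover the same four segments here. The choice of distance only changes the explanation when the black box is non-linear, which is the setting the abstract addresses.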

updated: Wed Jun 16 2021 10:39:04 GMT+0000 (UTC)

published: Mon Feb 22 2021 12:38:53 GMT+0000 (UTC)

arXiv
