Do Feature Attribution Methods Correctly Attribute Features?

Yilun Zhou; Serena Booth; Marco Tulio Ribeiro; Julie Shah

Feature attribution methods are popular in interpretable machine learning. These methods compute the attribution of each input feature to represent its importance, but there is no consensus on the definition of "attribution". This has led to many competing methods with little systematic evaluation, a problem made worse by the lack of ground truth attribution. To address this, we propose a dataset modification procedure to induce such ground truth. Using this procedure, we evaluate three common methods: saliency maps, rationales, and attentions. We identify several deficiencies and add new perspectives to the growing body of evidence questioning the correctness and reliability of these methods when applied to datasets in the wild. We further discuss possible avenues for remedy and recommend that new attribution methods be tested against ground truth before deployment. The code is available at https://github.com/YilunZhou/feature-attribution-evaluation.
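The abstract does not spell out the modification procedure, so the following is only a toy sketch of the general idea under stated assumptions: labels are reassigned at random and a label-dependent pattern is written into a chosen set of features, so that those features are, by construction, the only ones that carry label information. The function names `induce_ground_truth` and `attribution_mass_in_region`, the tabular data, and the correlation-based stand-in attribution are all hypothetical and are not taken from the authors' repository.

```python
import numpy as np


def induce_ground_truth(X, region, rng):
    """Return a modified copy of X and new binary labels (hypothetical helper).

    Labels are drawn independently of the original features, then copied into
    the columns listed in `region`, so only those columns predict the label.
    """
    X_mod = np.array(X, dtype=float)  # np.array copies by default
    y_new = rng.integers(0, 2, size=X_mod.shape[0])
    X_mod[:, region] = y_new[:, None]  # broadcast the label across the region
    return X_mod, y_new


def attribution_mass_in_region(attribution, region):
    """Fraction of total absolute attribution that falls on the known region."""
    a = np.abs(np.asarray(attribution, dtype=float))
    total = a.sum()
    return float(a[region].sum() / total) if total > 0 else 0.0


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
region = np.array([2, 5])  # assumed location of the injected pattern
X_mod, y = induce_ground_truth(X, region, rng)

# Stand-in "attribution": absolute correlation of each feature with the label.
# A real evaluation would use a saliency map, rationale, or attention weights.
toy_attr = np.array(
    [abs(np.corrcoef(X_mod[:, j], y)[0, 1]) for j in range(X_mod.shape[1])]
)
print(attribution_mass_in_region(toy_attr, region))
```

A score near 1 means the attribution concentrates on the features known to matter. The paper's actual datasets, manipulations, and metrics may differ from this sketch; the linked repository is the authoritative source.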

updated: Wed Dec 15 2021 16:30:39 GMT+0000 (UTC)

published: Tue Apr 27 2021 20:35:30 GMT+0000 (UTC)

arXiv
