Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations

Pau Rodriguez; Massimo Caccia; Alexandre Lacoste; Lee Zamparo; Issam Laradji; Laurent Charlin; David Vazquez

Explainability for machine learning models has gained considerable attention within our research community given the importance of deploying more reliable machine-learning systems. In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction, providing details about the model's decision-making. Current counterfactual methods yield ambiguous interpretations because they combine multiple biases of the model and the data in a single counterfactual interpretation of the model's decision. Moreover, these methods tend to generate trivial counterfactuals about the model's decision, as they often suggest exaggerating or removing the attribute being classified. For the machine learning practitioner, these types of counterfactuals offer little value, since they provide no new information about undesired model or data biases. In this work, we propose a counterfactual method that learns a perturbation in a disentangled latent space, constrained by a diversity-enforcing loss, to uncover multiple valuable explanations about the model's prediction. Further, we introduce a mechanism to prevent the model from producing trivial explanations. Experiments on CelebA and Synbols demonstrate that our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods. We will publish the code.
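
The abstract does not specify the form of the diversity-enforcing loss. As a rough illustration only, the sketch below assumes one common choice: penalizing the squared cosine similarity between every pair of latent perturbation vectors, so that perturbations pointing in the same direction are discouraged. The names cosine_similarity and diversity_penalty are hypothetical and are not taken from the authors' code.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def diversity_penalty(perturbations):
    """Mean squared cosine similarity over all pairs of perturbation vectors.

    ASSUMPTION: this is a generic diversity term chosen for illustration,
    not the loss used in the paper. Lower values mean more diverse
    (closer to mutually orthogonal) perturbations.
    """
    pairs = list(combinations(perturbations, 2))
    if not pairs:
        return 0.0  # fewer than two perturbations: nothing to compare
    return sum(cosine_similarity(u, v) ** 2 for u, v in pairs) / len(pairs)
```

Under this assumed form, mutually orthogonal perturbations incur no penalty, while perturbations that are scaled copies of one another incur the maximum penalty; in a training loop such a term would be added to the counterfactual objective with a weighting coefficient.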

updated: Thu Mar 18 2021 12:57:34 GMT+0000 (UTC)

published: Thu Mar 18 2021 12:57:34 GMT+0000 (UTC)

arXiv
