Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations

Pau Rodriguez; Massimo Caccia; Alexandre Lacoste; Lee Zamparo; Issam Laradji; Laurent Charlin; David Vazquez

Explainability for machine learning models has gained considerable attention within the research community, given the importance of deploying more reliable machine learning systems. In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction, providing details about the model's decision-making. Current methods tend to generate trivial counterfactuals about a model's decisions, as they often suggest exaggerating or removing the presence of the attribute being classified. For the machine learning practitioner, these types of counterfactuals offer little value, since they provide no new information about undesired model or data biases. In this work, we identify the problem of trivial counterfactual generation and propose DiVE to alleviate it. DiVE learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss to uncover multiple valuable explanations about the model's prediction. Further, we introduce a mechanism to prevent the model from producing trivial explanations. Experiments on CelebA and Synbols demonstrate that our model improves the success rate of producing high-quality, valuable explanations when compared to previous state-of-the-art methods. Code is available at https://github.com/ElementAI/beyond-trivial-explanations.
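
The abstract does not say what form the diversity-enforcing loss takes. As an illustration only, one common way to push a set of latent perturbations apart is to penalize the squared cosine similarity between every pair. The sketch below assumes that form; the function names `cosine_similarity` and `diversity_penalty` are hypothetical and are not taken from the DiVE code base.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def diversity_penalty(perturbations):
    """Mean squared pairwise cosine similarity over a set of perturbation vectors.

    ASSUMPTION: this is an illustrative penalty, not the loss defined in the paper.
    It is 0 when all perturbations are mutually orthogonal and 1 when they all
    point along the same line, so minimizing it encourages diverse directions.
    """
    pairs = list(combinations(perturbations, 2))
    if not pairs:
        raise ValueError("need at least two perturbations to measure diversity")
    return sum(cosine_similarity(u, v) ** 2 for u, v in pairs) / len(pairs)


# Hypothetical latent perturbations, for illustration only.
orthogonal = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
collinear = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]

print(diversity_penalty(orthogonal))
print(diversity_penalty(collinear))
```

In a full method, a term like this would be added to the objective that makes each perturbed input change the classifier's prediction, so that the explanations found differ from one another rather than collapsing onto a single direction.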

updated: Thu Nov 11 2021 17:55:27 GMT+0000 (UTC)

published: Thu Mar 18 2021 12:57:34 GMT+0000 (UTC)

arXiv
