Leveraging Conditional Generative Models in a General Explanation Framework of Classifier Decisions

Martin Charachon; Paul-Henry Cournède; Céline Hudelot; Roberto Ardon

Providing human-understandable explanations of classifiers' decisions has become imperative for building trust in their use for day-to-day tasks. Although many works have addressed this problem by generating visual explanation maps, they often produce noisy and inaccurate results, forcing the use of heuristic regularization unrelated to the classifier in question. In this paper, we propose a new general perspective on the visual explanation problem that overcomes these limitations. We show that a visual explanation can be produced as the difference between two generated images obtained via two specific conditional generative models. Both generative models are trained using the classifier to be explained and a database to enforce the following properties: (i) All images generated by the first generator are classified similarly to the input image, whereas the second generator's outputs are classified oppositely. (ii) Generated images belong to the distribution of real images. (iii) The distances between the input image and the corresponding generated images are minimal, so that the difference between the generated elements reveals only information relevant to the studied classifier. Using symmetrical and cyclic constraints, we present two different approximations and implementations of the general formulation. Experimentally, we demonstrate significant improvements over the state of the art on three different public data sets. In particular, the localization of regions influencing the classifier is consistent with human annotations.
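As a rough illustration of the idea in the abstract, the sketch below computes an explanation map as the pixel-wise difference between the two generated images and evaluates a toy objective covering properties (i) and (iii). It is not the authors' implementation: the binary classifier returning a probability, the cross-entropy and L1 terms, the weight `lam`, and all function names are assumptions made for this example, and the realism constraint of property (ii) is left out entirely.

```python
import numpy as np


def explanation_map(same_class_image, opposite_class_image):
    """Pixel-wise absolute difference between the two generated images."""
    return np.abs(same_class_image - opposite_class_image)


def binary_cross_entropy(prob, target, eps=1e-7):
    """Cross-entropy between a predicted probability and a 0/1 target."""
    prob = np.clip(prob, eps, 1.0 - eps)
    return float(-(target * np.log(prob) + (1.0 - target) * np.log(1.0 - prob)))


def toy_objective(x, same_class_image, opposite_class_image, classifier, lam=1.0):
    """Hypothetical objective for properties (i) and (iii) only.

    `classifier` is assumed to map an image to the probability of class 1.
    Property (ii), realism of the generated images, is not modelled here.
    """
    label = float(classifier(x) >= 0.5)
    # (i) first output keeps the classifier's decision, second output flips it
    classification_term = (
        binary_cross_entropy(classifier(same_class_image), label)
        + binary_cross_entropy(classifier(opposite_class_image), 1.0 - label)
    )
    # (iii) both outputs stay close to the input (L1 distance is an assumption)
    distance_term = (
        np.mean(np.abs(x - same_class_image))
        + np.mean(np.abs(x - opposite_class_image))
    )
    return classification_term + lam * distance_term
```

In this sketch the generators themselves are not defined; the two images would come from whatever conditional generative models are trained against such an objective, and the map highlights only the pixels where the two outputs disagree.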

updated: Mon Jun 21 2021 09:41:54 GMT+0000 (UTC)

published: Mon Jun 21 2021 09:41:54 GMT+0000 (UTC)

arXiv
