Invertible Concept-based Explanations for CNN Models with Non-negative Concept Activation Vectors

Ruihan Zhang; Prashan Madumal; Tim Miller; Krista A. Ehinger; Benjamin I. P. Rubinstein

Convolutional neural network (CNN) models for computer vision are powerful but lack explainability in their most basic form. This deficiency remains a key challenge when applying CNNs in important domains. Recent work on explanations through feature importance of approximate linear models has moved from input-level features (pixels or segments) to features from mid-layer feature maps in the form of concept activation vectors (CAVs). CAVs contain concept-level information and can be learned via clustering. In this work, we rethink the ACE algorithm of Ghorbani et al., proposing an alternative invertible concept-based explanation (ICE) framework to overcome its shortcomings. Based on the requirements of fidelity (how closely the approximate model matches the target model) and interpretability (being meaningful to people), we design measurements and evaluate a range of matrix factorization methods with our framework. We find that non-negative concept activation vectors (NCAVs) from non-negative matrix factorization provide superior performance in interpretability and fidelity based on computational and human subject experiments. Our framework provides both local and global concept-level explanations for pre-trained CNN models.
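As a rough illustration of the idea in the abstract, the sketch below factors a stack of mid-layer feature maps with non-negative matrix factorization and then maps the concept scores back to feature space. It is a minimal sketch under stated assumptions, not the authors' implementation: the feature maps are synthetic random data standing in for post-ReLU CNN activations, the helper `nmf_multiplicative` and all variable names are hypothetical, the factorization uses the standard Lee-Seung multiplicative updates, and the relative reconstruction error is only a simple proxy, not the fidelity measure designed in the paper.

```python
import numpy as np


def nmf_multiplicative(V, n_concepts, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (samples x channels) as S @ P, with S, P >= 0.

    Hypothetical helper: Lee-Seung multiplicative updates for the squared
    Frobenius loss. Rows of P play the role of non-negative concept vectors.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_channels = V.shape
    S = rng.random((n_samples, n_concepts)) + eps   # concept scores
    P = rng.random((n_concepts, n_channels)) + eps  # concept vectors
    for _ in range(n_iter):
        P *= (S.T @ V) / (S.T @ S @ P + eps)
        S *= (V @ P.T) / (S @ P @ P.T + eps)
    return S, P


# Assumption: synthetic stand-in for post-ReLU feature maps,
# shaped (images, height, width, channels).
rng = np.random.default_rng(0)
feature_maps = rng.random((8, 4, 4, 16))
n, h, w, c = feature_maps.shape

# Treat every spatial position as one sample of a c-dimensional activation.
V = feature_maps.reshape(n * h * w, c)

n_concepts = 5
S, P = nmf_multiplicative(V, n_concepts)

# Per-image concept score maps, and the approximate inverse back to feature space.
concept_maps = S.reshape(n, h, w, n_concepts)
reconstructed = (S @ P).reshape(n, h, w, c)

# Relative reconstruction error: a proxy for how much information the
# concept representation keeps (not the paper's fidelity measure).
rel_error = np.linalg.norm(feature_maps - reconstructed) / np.linalg.norm(feature_maps)
print(f"relative reconstruction error: {rel_error:.3f}")
```

With real activations, `feature_maps` would come from a chosen layer of a pre-trained CNN, and the reconstructed maps could be passed through the remaining layers to compare predictions against the original model.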

updated: Thu Feb 04 2021 13:54:53 GMT+0000 (UTC)

published: Sat Jun 27 2020 17:57:26 GMT+0000 (UTC)

arXiv
