Dividing and Conquering a BlackBox to a Mixture of Interpretable Models: Route, Interpret, Repeat

Shantanu Ghosh; Ke Yu; Forough Arabshahi; Kayhan Batmanghelich

ML model design either starts with an interpretable model or starts with a Blackbox and explains it post hoc. Blackbox models are flexible but difficult to explain, while interpretable models are inherently explainable. Yet interpretable models require extensive ML knowledge and tend to be less flexible than their Blackbox variants and to underperform them. This paper aims to blur the distinction between a post hoc explanation of a Blackbox and the construction of interpretable models. Beginning with a Blackbox, we iteratively carve out a mixture of interpretable experts (MoIE) and a residual network. Each interpretable model specializes in a subset of samples and explains them using First Order Logic (FOL), providing basic reasoning on concepts from the Blackbox. We route the remaining samples through a flexible residual. We repeat the method on the residual network until the interpretable models together explain the desired proportion of data. Our extensive experiments show that our route, interpret, and repeat approach (1) identifies a diverse set of instance-specific concepts with high concept completeness via MoIE without compromising performance, (2) identifies the relatively "harder" samples to explain via residuals, (3) outperforms interpretable-by-design models by significant margins during test-time interventions, and (4) fixes the shortcut learned by the original Blackbox. The code for MoIE is publicly available at: https://github.com/batmanlab/ICML-2023-Route-interpret-repeat.
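The route, interpret, and repeat loop described above can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes binary concepts and binary Blackbox predictions, and it swaps the learned selectors and interpretable experts for a greedy single-concept rule, which is a plain decision-list stand-in. The function names `route_interpret_repeat` and `rule_to_fol`, the parameters `target_coverage` and `min_purity`, and the toy data are all hypothetical.

```python
import numpy as np


def route_interpret_repeat(concepts, blackbox_pred, target_coverage=0.9, min_purity=1.0):
    """Toy loop: peel off one rule-based expert per iteration.

    Assumption: each "expert" is a single-concept rule, and its "selector"
    is simply the test `concept j == v`. The paper's method learns both.
    """
    n, k = concepts.shape
    remaining = np.ones(n, dtype=bool)  # samples currently routed to the residual
    experts = []                        # list of (concept index, concept value, label)

    # Repeat until the experts together cover the desired share of samples.
    while (n - remaining.sum()) < target_coverage * n:
        best, best_size = None, 0
        for j in range(k):
            for v in (0, 1):
                mask = remaining & (concepts[:, j] == v)  # route
                size = int(mask.sum())
                if size == 0:
                    continue
                labels = blackbox_pred[mask]
                label = int(np.bincount(labels, minlength=2).argmax())
                purity = float((labels == label).mean())  # agreement with the Blackbox
                if purity >= min_purity and size > best_size:
                    best, best_size = (j, v, label, mask), size
        if best is None:
            break  # nothing left that a simple rule explains; keep it in the residual
        j, v, label, mask = best
        experts.append((j, v, label))   # interpret
        remaining &= ~mask              # the rest goes on to the next iteration
    return experts, remaining


def rule_to_fol(expert):
    """Render a toy expert as a FOL-style string."""
    j, v, label = expert
    literal = f"c{j}" if v == 1 else f"NOT c{j}"
    return f"{literal} -> y={label}"


# Hypothetical toy data: 6 samples, 2 binary concepts, binary Blackbox predictions.
toy_concepts = np.array([[1, 0], [1, 1], [0, 1], [0, 1], [0, 0], [0, 0]])
toy_blackbox = np.array([1, 1, 0, 0, 1, 0])

toy_experts, toy_residual = route_interpret_repeat(toy_concepts, toy_blackbox, target_coverage=1.0)
toy_rules = [rule_to_fol(e) for e in toy_experts]
```

In this toy run the last two samples share the same concept values but receive different Blackbox predictions, so no single-concept rule explains them and they remain in the residual, loosely mirroring the "harder" samples mentioned in the abstract.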

updated: Thu Apr 27 2023 18:40:36 GMT+0000 (UTC)

published: Mon Feb 20 2023 20:25:41 GMT+0000 (UTC)

arXiv
