Defining and Quantifying the Emergence of Sparse Concepts in DNNs

Jie Ren; Mingjie Li; Qirui Chen; Huiqi Deng; Quanshi Zhang

This paper aims to illustrate the concept-emerging phenomenon in a trained DNN. Specifically, we find that the inference score of a DNN can be disentangled into the effects of a few interactive concepts. These concepts can be understood as causal patterns in a sparse, symbolic causal graph that explains the DNN. The faithfulness of this explanation is theoretically guaranteed: we prove that the causal graph closely mimics the DNN's outputs on an exponential number of different masked samples. Moreover, the causal graph can be further simplified and rewritten as an And-Or graph (AOG) without losing much explanation accuracy.
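The disentanglement claim can be made concrete with a small sketch. It assumes, as an illustration only, that the effect of a set S of input variables is measured by the Harsanyi dividend, I(S) = sum over T ⊆ S of (-1)^(|S|-|T|) v(T), where v(T) is the model output when only the variables in T are left unmasked. That definition, the helper names, and `toy_model` (a hypothetical stand-in for a trained DNN) are not taken from the abstract above.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def harsanyi_effects(v, n):
    """Return {S: I(S)} with I(S) = sum_{T subset of S} (-1)^(|S|-|T|) * v(T)."""
    effects = {}
    for S in subsets(range(n)):
        effects[S] = sum(
            (-1) ** (len(S) - len(T)) * v(T) for T in subsets(S)
        )
    return effects


def toy_model(kept):
    """Hypothetical score on a masked sample; `kept` holds the unmasked indices."""
    score = 0.5                      # output when every variable is masked
    if 0 in kept:
        score += 1.0                 # effect of variable 0 alone
    if {1, 2} <= kept:
        score += 2.0                 # AND pattern: needs variables 1 and 2
    if {0, 1, 3} <= kept:
        score -= 1.5                 # AND pattern: needs variables 0, 1 and 3
    return score


N = 4
effects = harsanyi_effects(toy_model, N)

# Only a few of the 2^N candidate patterns carry a non-zero effect.
salient = {S: e for S, e in effects.items() if abs(e) > 1e-9}


def reconstruct(kept):
    """Sum the effects of all patterns fully contained in `kept`."""
    return sum(e for S, e in effects.items() if S <= kept)


# The sum of effects reproduces the model on every one of the 2^N masked samples.
max_error = max(
    abs(reconstruct(S) - toy_model(S)) for S in subsets(range(N))
)

for S, e in sorted(salient.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(S), e)
print("max reconstruction error:", max_error)
```

In this toy setting only a handful of the 16 candidate patterns have a non-zero effect, and summing the effects of the patterns contained in a masked sample recovers the model output on that sample, which is the sense in which a sparse graph of patterns can mimic the model across all masked inputs. The enumeration is exponential in the number of variables, so the sketch is only practical for very small N.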

updated: Mon Apr 03 2023 12:02:02 GMT+0000 (UTC)

published: Thu Nov 11 2021 13:48:20 GMT+0000 (UTC)

arXiv
