This Looks Like That... Does it? Shortcomings of Latent Space Prototype Interpretability in Deep Networks

Adrian Hoffmann; Claudio Fanconi; Rahul Rade; Jonas Kohler

Deep neural networks that yield human-interpretable decisions by architectural design have lately become an increasingly popular alternative to post hoc interpretation of traditional black-box models. Among these networks, arguably the most widespread approach is so-called prototype learning, where similarities to learned latent prototypes serve as the basis for classifying an unseen data point. In this work, we point to an important shortcoming of such approaches: there is a semantic gap between similarity in latent space and similarity in input space, which can corrupt interpretability. We design two experiments that exemplify this issue on the so-called ProtoPNet. Specifically, we find that this network's interpretability mechanism can be led astray by intentionally crafted artefacts or even by JPEG compression artefacts, which can produce incomprehensible decisions. We argue that practitioners ought to keep this shortcoming in mind when deploying prototype-based models.
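The semantic gap described above can be illustrated with a small toy sketch. This is not the paper's experimental setup: the linear encoder `W`, the two prototypes, and the inputs `x_a` and `x_b` are all hypothetical, chosen only to show that two points far apart in input space can be indistinguishable in latent space. The log-ratio similarity is assumed here to follow the form used by ProtoPNet.

```python
import numpy as np

def prototype_similarity(z, prototypes, eps=1e-4):
    """Log-ratio similarity between a latent vector z and each prototype.

    Assumed form: log((d^2 + 1) / (d^2 + eps)), large when z is close
    to a prototype and near zero when it is far away.
    """
    d2 = np.sum((prototypes - z) ** 2, axis=1)
    return np.log((d2 + 1.0) / (d2 + eps))

def encode(x, W):
    """Toy linear encoder standing in for a deep feature extractor."""
    return W @ x

def classify(x, W, prototypes, proto_class):
    """Predict the class of the most similar prototype in latent space."""
    sims = prototype_similarity(encode(x, W), prototypes)
    return int(proto_class[np.argmax(sims)]), sims

# Hypothetical encoder that ignores the third input dimension entirely.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
prototypes = np.array([[0.0, 0.0],   # prototype for class 0
                       [5.0, 5.0]])  # prototype for class 1
proto_class = np.array([0, 1])

# Two inputs that differ strongly, but only along the ignored dimension.
x_a = np.array([0.1, -0.1, 0.0])
x_b = np.array([0.1, -0.1, 50.0])

label_a, sims_a = classify(x_a, W, prototypes, proto_class)
label_b, sims_b = classify(x_b, W, prototypes, proto_class)

input_distance = float(np.linalg.norm(x_a - x_b))
latent_distance = float(np.linalg.norm(encode(x_a, W) - encode(x_b, W)))

print("labels:", label_a, label_b)
print("input-space distance:", input_distance)
print("latent-space distance:", latent_distance)
```

In this sketch both inputs receive the same label and the same prototype similarities even though they are far apart in input space. A "this looks like that" explanation based on the matched prototype would therefore be equally confident for both, which is the kind of mismatch the abstract warns about.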

updated: Mon May 10 2021 08:48:08 GMT+0000 (UTC)

published: Wed May 05 2021 12:28:34 GMT+0000 (UTC)

arXiv
