Evaluating the Stability of Semantic Concept Representations in CNNs for Robust Explainability

Georgii Mikriukov; Gesina Schwalbe; Christian Hellert; Korinna Bade

Analyzing how semantic concepts are represented within Convolutional Neural Networks (CNNs) is a widely used approach in Explainable Artificial Intelligence (XAI) for interpreting CNNs. One motivation is the need for transparency in safety-critical AI-based systems, as mandated in domains such as automated driving. However, to use concept representations for safety-relevant purposes such as inspection or error retrieval, they must be of high quality and, in particular, stable. This paper focuses on two stability goals when working with concept representations in computer vision CNNs: stability of concept retrieval and stability of concept attribution. The guiding use case is a post-hoc explainability framework for object detection (OD) CNNs, to which existing concept analysis (CA) methods are successfully adapted. To address concept retrieval stability, we propose a novel metric that considers both concept separation and consistency, and is agnostic to the layer and to the dimensionality of the concept representation. We then investigate how concept abstraction level, number of concept training samples, CNN size, and concept representation dimensionality affect stability. For concept attribution stability, we explore the effect of gradient instability on gradient-based explainability methods. The results on various CNNs for classification and object detection yield two main findings: (1) the stability of concept retrieval can be enhanced through dimensionality reduction via data aggregation, and (2) in shallow layers, where gradient instability is more pronounced, gradient smoothing techniques are advised. Finally, our approach provides valuable insights into selecting the appropriate layer and concept representation dimensionality, paving the way towards CA in safety-critical XAI applications.
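As a rough illustration of what gradient smoothing can look like, the sketch below averages gradients over Gaussian-perturbed copies of an input (a SmoothGrad-style scheme). It is not taken from the paper: the function names `smoothed_gradient` and `toy_grad`, the toy function, the noise level, and the sample count are all assumptions chosen for the example, and the abstract does not state which smoothing technique the authors use.

```python
import numpy as np


def smoothed_gradient(grad_fn, x, n_samples=50, noise_std=0.1, seed=0):
    """Average grad_fn over Gaussian-perturbed copies of x (SmoothGrad-style).

    grad_fn is a hypothetical callable returning the gradient at a point;
    in a real setting it would wrap a backward pass through the CNN.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for _ in range(n_samples):
        noisy = x + rng.normal(0.0, noise_std, size=x.shape)
        total += grad_fn(noisy)
    return total / n_samples


def toy_grad(x):
    # Gradient of f(x) = sum(x**2 + 0.01 * sin(100 * x)):
    # a smooth trend (2x) plus a rapidly oscillating term (cos(100x)),
    # used here as a stand-in for an unstable gradient.
    return 2.0 * x + np.cos(100.0 * x)


x = np.array([0.5, -1.0, 2.0])
trend = 2.0 * x                      # the slowly varying part of the gradient
raw = toy_grad(x)                    # single-point gradient, dominated by ripple
smooth = smoothed_gradient(toy_grad, x, n_samples=2000, noise_std=0.1)

print("raw error:   ", np.abs(raw - trend).max())
print("smooth error:", np.abs(smooth - trend).max())
```

In this toy setting the averaged gradient tracks the slowly varying trend far more closely than the single-point gradient does. The noise scale and number of samples trade off smoothing strength against cost and would need tuning for an actual network.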

updated: Fri Apr 28 2023 14:14:00 GMT+0000 (UTC)

published: Fri Apr 28 2023 14:14:00 GMT+0000 (UTC)

arXiv
