Explaining Deep Convolutional Neural Networks via Latent Visual-Semantic Filter Attention

Yu Yang; Seungbae Kim; Jungseock Joo

Interpretability is an important property for visual models because it helps researchers and users understand the internal mechanisms of a complex model. However, generating semantic explanations of learned representations is challenging without direct supervision for producing such explanations. We propose a general framework, Latent Visual Semantic Explainer (LaViSE), that teaches any existing convolutional neural network to generate text descriptions of its own latent representations at the filter level. Our method constructs a mapping between the visual and semantic spaces from the images and category names of generic image datasets. It then transfers the mapping to the target domain, which does not have semantic labels. The proposed framework employs a modular structure and can analyze any trained network, whether or not its original training data is available. We show that our method can generate novel descriptions for learned filters beyond the set of categories defined in the training dataset, and we perform an extensive evaluation on multiple datasets. We also demonstrate a novel application of our method to unsupervised dataset bias analysis, which allows us to automatically discover hidden biases in datasets or compare different subsets without using additional labels. The dataset and code are made public to facilitate further research.
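
The abstract describes mapping filter-level visual features into a semantic space and reading off text descriptions. The toy sketch below illustrates that general idea only; it is not the authors' implementation. The linear map `W`, the two-dimensional word vectors, the function names, and the nearest-neighbor ranking by cosine similarity are all assumptions made for illustration. The vocabulary deliberately contains words ("puppy", "truck") beyond a hypothetical set of training categories ("dog", "car"), to show how a description can fall outside the labeled set.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def project(W, x):
    """Apply a linear visual-to-semantic map W (one row per semantic dim)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def describe_filter(filter_feature, W, word_vectors, top_k=2):
    """Return the top_k vocabulary words closest to the projected feature."""
    z = project(W, filter_feature)
    ranked = sorted(
        word_vectors,
        key=lambda word: cosine(z, word_vectors[word]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical word embeddings (assumption: real ones would be pretrained).
WORD_VECTORS = {
    "dog": [1.0, 0.0],
    "puppy": [0.9, 0.1],
    "car": [0.0, 1.0],
    "truck": [0.1, 0.9],
}

# Hypothetical learned map from a 3-d visual feature to the 2-d word space.
W = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]

# Hypothetical pooled activation of one filter on its top-activating images.
filter_feature = [0.8, 0.1, 0.5]

print(describe_filter(filter_feature, W, WORD_VECTORS))
```

In this sketch the map would be fit on a labeled generic dataset and then reused unchanged on a target network's filters, which is one simple way to read the transfer step described above.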

updated: Sun Apr 10 2022 04:57:56 GMT+0000 (UTC)

published: Sun Apr 10 2022 04:57:56 GMT+0000 (UTC)

arXiv
