BR-NPA: A Non-Parametric High-Resolution Attention Model to improve the Interpretability of Attention

Tristan Gomez; Suiyi Ling; Thomas Fréour; Harold Mouchère



The prevalence of attention mechanisms has raised concerns about the interpretability of attention distributions. Although attention provides insight into how a model operates, using it as an explanation of model predictions remains highly dubious. The community is still seeking more interpretable strategies for identifying the local active regions that contribute most to the final decision. To improve the interpretability of existing attention models, we propose a novel Bilinear Representative Non-Parametric Attention (BR-NPA) strategy that captures task-relevant, human-interpretable information. The target model is first distilled to have higher-resolution intermediate feature maps. Representative features from these maps are then grouped based on local pairwise feature similarity to produce finer-grained, more precise attention maps that highlight task-relevant parts of the input. The obtained attention maps are ranked according to the activity level of the compound feature, which indicates the importance of the highlighted regions. The proposed model can be easily adapted to a wide variety of modern deep models that involve classification. Extensive quantitative and qualitative experiments show more comprehensive and accurate visual explanations than state-of-the-art attention models and visualization methods across multiple tasks, including fine-grained image classification, few-shot classification, and person re-identification, without compromising classification accuracy. The proposed visualization model sheds important light on how neural networks "pay attention" differently in different tasks.
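The grouping and ranking steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how representatives are chosen or how similarity is measured. The sketch assumes greedy selection of the highest-norm feature vector as a representative, cosine similarity as the pairwise measure, and the norm of the similarity-weighted mean feature as the "activity level" of the compound feature. The function name `npa_attention_maps` and its parameters are hypothetical, and the distillation step that produces the high-resolution feature map is not shown.

```python
import numpy as np


def npa_attention_maps(features, num_maps=3, eps=1e-8):
    """Illustrative non-parametric attention over a (C, H, W) feature map.

    Assumptions (not confirmed by the abstract): representatives are picked
    greedily by largest feature norm, similarity is cosine similarity, and a
    map's activity is the norm of its similarity-weighted mean feature.
    Returns (maps, activities), both sorted by decreasing activity.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w).T           # one C-dim vector per location
    norms = np.linalg.norm(flat, axis=1)          # per-location activity
    unit = flat / (norms[:, None] + eps)          # unit-length feature vectors

    available = norms.copy()                      # selection scores, suppressed over time
    maps, activities = [], []
    for _ in range(num_maps):
        idx = int(np.argmax(available))           # next representative location
        sim = np.clip(unit @ unit[idx], 0.0, 1.0)  # cosine similarity to it
        compound = (sim[:, None] * flat).sum(axis=0) / (sim.sum() + eps)
        maps.append(sim.reshape(h, w))
        activities.append(float(np.linalg.norm(compound)))
        available = available * (1.0 - sim)       # down-weight locations already covered

    order = np.argsort(activities)[::-1]          # rank maps by activity level
    return np.stack(maps)[order], np.asarray(activities)[order]
```

Because no attention weights are learned, nothing in this sketch adds parameters to the model. Each map is derived directly from the feature vectors, which is one way to read "non-parametric" in the name.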

updated: Mon Jan 31 2022 14:50:41 GMT+0000 (UTC)

published: Fri Jun 04 2021 15:57:37 GMT+0000 (UTC)

arXiv


