Generic Attention-model Explainability by Weighted Relevance Accumulation

Yiming Huang; Aozhe Jia; Xiaodan Zhang; Jiawei Zhang

Attention-based transformer models have achieved remarkable progress in multi-modal tasks such as visual question answering. Explainability methods for attention-based models have recently attracted wide interest because they can explain the inner changes of attention tokens by accumulating relevance across attention layers. Current methods update relevance by simply accumulating the token relevance before and after each attention process with equal weight. However, the importance of token values usually differs during relevance accumulation. In this paper, we propose a weighted relevance strategy that takes the importance of token values into account, reducing the distortion introduced by equal accumulation. To evaluate our method, we propose a unified CLIP-based two-stage model, named CLIPmapper, which processes vision-and-language tasks through a CLIP encoder followed by a mapper. Because CLIPmapper combines self-attention, cross-attention, single-modality attention, and cross-modality attention, it is more suitable for evaluating our generic explainability method. Extensive perturbation tests on visual question answering and image captioning validate that our explainability method outperforms existing methods.
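The contrast between equal and weighted accumulation can be illustrated with a small NumPy sketch. The abstract gives no formulas, so everything here is an assumption made for illustration: the additive rule `R + A_bar @ R` stands in for "equal accumulation", `A_bar` is a made-up attention map, and the per-token weights `w_before` and `w_after` are hypothetical placeholders. The paper derives its weights from the importance of token values; that derivation is not reproduced here.

```python
import numpy as np

def equal_update(R, A_bar):
    """Equal accumulation (assumed form): the relevance before attention (R)
    and after attention (A_bar @ R) are summed with the same weight."""
    return R + A_bar @ R

def weighted_update(R, A_bar, w_before, w_after):
    """Weighted accumulation (illustrative only): hypothetical per-token
    weights scale the 'before' and 'after' terms row by row."""
    return w_before[:, None] * R + w_after[:, None] * (A_bar @ R)

n_tokens = 3
R0 = np.eye(n_tokens)  # each token starts out relevant only to itself

# Made-up row-stochastic attention map standing in for one attention layer.
A_bar = np.array([[0.5, 0.5, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.2, 0.3, 0.5]])

R_equal = equal_update(R0, A_bar)

# With all weights set to 1, the weighted rule reduces to the equal rule.
ones = np.ones(n_tokens)
R_weighted_same = weighted_update(R0, A_bar, ones, ones)

# Hypothetical weights (NOT the paper's): each token splits a unit budget
# between its previous relevance and the attention-propagated relevance.
w_before = np.array([0.8, 0.5, 0.3])
w_after = 1.0 - w_before
R_weighted = weighted_update(R0, A_bar, w_before, w_after)

print(R_equal)
print(R_weighted)
```

In this toy setting the weights only change how much each token keeps of its previous relevance versus what it receives through attention; how those weights should be computed is exactly what the paper's method specifies.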

updated: Sun Aug 20 2023 12:02:30 GMT+0000 (UTC)

published: Sun Aug 20 2023 12:02:30 GMT+0000 (UTC)

arXiv
