Group-CAM: Group Score-Weighted Visual Explanations for Deep Convolutional Networks

Qinglong Zhang; Lu Rao; Yubin Yang

In this paper, we propose an efficient saliency map generation method, called Group score-weighted Class Activation Mapping (Group-CAM), which adopts the "split-transform-merge" strategy to generate saliency maps. Specifically, for an input image, the class activations are first split into groups. In each group, the sub-activations are summed and de-noised to form an initial mask. The initial masks are then transformed with meaningful perturbations and applied to the input to preserve sub-pixels of it (i.e., to produce masked inputs), which are fed into the network to calculate confidence scores. Finally, the initial masks are combined in a weighted sum to form the final saliency map, where the weights are the confidence scores produced by the masked inputs. Group-CAM is efficient yet effective, requiring only dozens of queries to the network while producing target-related saliency maps. As a result, Group-CAM can serve as an effective data augmentation trick for fine-tuning networks. We comprehensively evaluate the performance of Group-CAM on commonly used benchmarks, including deletion and insertion tests on ImageNet-1k and pointing game tests on COCO2017. Extensive experimental results demonstrate that Group-CAM achieves better visual performance than the current state-of-the-art explanation approaches. The code is available at https://github.com/wofmanaf/Group-CAM.
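
The split-transform-merge pipeline described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the repository linked above for that), and several details are assumptions the abstract does not specify: the class activations are taken as given and already resized to the image resolution, de-noising is done by zeroing values below a quantile, the "meaningful perturbation" is a simple box blur, and `score_fn`, `box_blur`, and `group_cam_sketch` are hypothetical names introduced only for illustration.

```python
import numpy as np

def box_blur(img, k=5):
    """Mean filter over a k x k window (assumed stand-in for the perturbation)."""
    pad = k // 2
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def group_cam_sketch(image, class_activations, score_fn, groups=4, quantile=0.7):
    """Sketch of split-transform-merge.

    image:             float array (H, W, C)
    class_activations: float array (K, H, W), assumed class-specific and
                       already upsampled to the image size
    score_fn:          hypothetical callable, image -> confidence of target class
    """
    if class_activations.shape[0] % groups != 0:
        raise ValueError("number of activation channels must be divisible by groups")
    blurred = box_blur(image)
    saliency = np.zeros(image.shape[:2], dtype=float)
    # Split: divide the activation channels into equally sized groups.
    for chunk in np.split(class_activations, groups, axis=0):
        # Sum the sub-activations of the group and keep positive evidence.
        mask = np.maximum(chunk.sum(axis=0), 0.0)
        # De-noise (assumption): zero everything below the chosen quantile.
        mask = np.where(mask >= np.quantile(mask, quantile), mask, 0.0)
        value_range = mask.max() - mask.min()
        if value_range == 0:
            continue  # a constant mask carries no spatial information
        mask = (mask - mask.min()) / value_range  # scale to [0, 1]
        # Transform: keep masked pixels, replace the rest with the blurred image.
        m = mask[..., None]
        masked_input = m * image + (1.0 - m) * blurred
        # Merge: weight the initial mask by the confidence of the masked input.
        saliency += score_fn(masked_input) * mask
    return saliency

# Toy usage: the "network" only looks at the top-left quadrant of the image.
image = np.zeros((8, 8, 1))
image[:4, :4, 0] = 1.0
activations = np.zeros((4, 8, 8))
activations[:2, :4, :4] = 1.0   # channels that fire on the relevant region
activations[2:, 4:, 4:] = 1.0   # channels that fire on an irrelevant region
toy_score = lambda x: float(x[:4, :4].mean())
saliency = group_cam_sketch(image, activations, toy_score, groups=2)
```

With `groups` masks, the sketch queries `score_fn` once per group, which is in line with the abstract's claim that only dozens of network queries are needed.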

updated: Sat Jun 19 2021 09:40:17 GMT+0000 (UTC)

published: Thu Mar 25 2021 14:16:02 GMT+0000 (UTC)

arXiv
