An Attention Module for Convolutional Neural Networks

Zhu Baozhou; Peter Hofstee; Jinho Lee; Zaid Al-Ars

The attention mechanism is regarded as an advanced technique for capturing long-range feature interactions and boosting the representation capability of convolutional neural networks. However, we identify two overlooked problems in current attentional activations-based models: the approximation problem and the insufficient capacity of the attention maps. To solve both problems together, we propose an attention module for convolutional neural networks built on an AW-convolution, in which the shape of the attention maps matches that of the weights rather than that of the activations. The proposed attention module is complementary to previous attention-based schemes, such as those that apply the attention mechanism to explore the relationship between channel-wise and spatial features. Experiments on several datasets for image classification and object detection show the effectiveness of the proposed attention module. In particular, it achieves a 1.00% Top-1 accuracy improvement on ImageNet classification over a ResNet101 baseline, and a 0.63 improvement in COCO-style Average Precision on COCO object detection over a Faster R-CNN baseline with a ResNet101-FPN backbone. When integrated with previous attentional activations-based models, the proposed attention module further increases their Top-1 accuracy on ImageNet classification by up to 0.57% and their COCO-style Average Precision on COCO object detection by up to 0.45. Code and pre-trained models will be made publicly available.
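To make the shape distinction in the abstract concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes that a weight-shaped attention map is applied by element-wise multiplication with the convolution weights; the abstract only states that the attention map has the shape of the weights, so that detail, along with the helper names `conv2d` and `scale_weights` and all of the toy values, is hypothetical.

```python
# Illustrative sketch only. All names and values are hypothetical, and the
# element-wise product of attention map and weights is an assumption.

def conv2d(x, w):
    """Valid cross-correlation. x: [C_in][H][W], w: [C_out][C_in][K][K]."""
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    k = len(w[0][0])
    out = []
    for w_o in w:
        plane = []
        for i in range(h - k + 1):
            row = []
            for j in range(wd - k + 1):
                acc = 0.0
                for c in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            acc += w_o[c][di][dj] * x[c][i + di][j + dj]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def scale_weights(a, w):
    """Element-wise product of a weight-shaped attention map and the weights."""
    return [[[[a_v * w_v for a_v, w_v in zip(a_row, w_row)]
              for a_row, w_row in zip(a_k, w_k)]
             for a_k, w_k in zip(a_o, w_o)]
            for a_o, w_o in zip(a, w)]


# Toy input (2 channels, 3x3) and weights (2 output channels, 2 input channels, 2x2).
x = [[[1, 2, 0], [0, 1, 3], [2, 1, 1]],
     [[0, 1, 1], [1, 0, 2], [3, 1, 0]]]
w = [[[[1, 0], [0, 1]], [[1, 1], [0, 0]]],
     [[[0, 1], [1, 0]], [[2, 0], [0, 1]]]]
s = [0.5, 2.0]  # one attention value per output channel

# Activation-side channel attention: rescale each output channel after the convolution.
y_act = [[[s[o] * v for v in row] for row in plane]
         for o, plane in enumerate(conv2d(x, w))]

# The same attention expressed as a weight-shaped map: a[o][c][i][j] = s[o].
a_const = [[[[s[o] for _ in row] for row in kern] for kern in w[o]]
           for o in range(len(w))]
y_aw = conv2d(x, scale_weights(a_const, w))

# A weight-shaped map may also differ per input channel (here it zeroes input
# channel 1), which rescaling whole output channels alone cannot reproduce.
a_full = [[[[s[o] * (1.0 if c == 0 else 0.0) for _ in row] for row in kern]
           for c, kern in enumerate(w[o])]
          for o in range(len(w))]
y_full = conv2d(x, scale_weights(a_full, w))

print(y_act == y_aw)
print(y_full == y_act)
```

In this toy setting, per-output-channel scaling of the activations corresponds to a weight-shaped map that is constant across input channels and kernel positions, while a general weight-shaped map has more free entries. Whether this matches what the paper means by attention-map capacity is an interpretation of the abstract, not a claim taken from it.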

updated: Wed Aug 18 2021 15:36:18 GMT+0000 (UTC)

published: Wed Aug 18 2021 15:36:18 GMT+0000 (UTC)

arXiv
