EPSANet: An Efficient Pyramid Squeeze Attention Block on Convolutional Neural Network

Hu Zhang; Keke Zu; Jian Lu; Yuru Zou; Deyu Meng

Recently, it has been demonstrated that the performance of a deep convolutional neural network can be effectively improved by embedding an attention module into it. In this work, a novel lightweight and effective attention method named the Pyramid Squeeze Attention (PSA) module is proposed. By replacing the 3x3 convolution with the PSA module in the bottleneck blocks of ResNet, a novel representational block named Efficient Pyramid Squeeze Attention (EPSA) is obtained. The EPSA block can be easily added as a plug-and-play component to a well-established backbone network, and it yields significant improvements in model performance. Hence, a simple and efficient backbone architecture named EPSANet is developed in this work by stacking these ResNet-style EPSA blocks. Correspondingly, the proposed EPSANet offers a stronger multi-scale representation ability for various computer vision tasks, including but not limited to image classification, object detection, and instance segmentation. Without bells and whistles, the proposed EPSANet outperforms most state-of-the-art channel attention methods. Compared to SENet-50, the Top-1 accuracy is improved by 1.93% on the ImageNet dataset, and on the MS-COCO dataset, using Mask-RCNN, gains of +2.7 box AP for object detection and +1.7 mask AP for instance segmentation are obtained. Our source code is available at: https://github.com/murufeng/EPSANet.
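
The abstract names the PSA module but does not describe its internals, so the following is only a rough, dependency-free sketch of one plausible reading of "pyramid squeeze attention": several feature-map branches (one per scale) are each squeezed to per-channel descriptors by global average pooling, a softmax across scales turns those descriptors into attention weights, and the reweighted branches are concatenated along the channel axis. The function names, the nested-list tensor layout `[C][H][W]`, and the omission of any learned layers (multi-scale convolutions, excitation layers) are all assumptions made for illustration; the authors' repository linked above is the authoritative implementation.

```python
import math


def global_avg_pool(feature_map):
    """Squeeze step: reduce each channel's H x W grid to a single scalar."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def softmax(values):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def pyramid_squeeze_attention(branches):
    """Hypothetical sketch, not the authors' code.

    branches: list of S feature maps (one per scale), each shaped [C][H][W].
    Returns (output, weights), where output is the reweighted branches
    concatenated along the channel axis and weights is an S x C table.
    """
    descriptors = [global_avg_pool(branch) for branch in branches]  # S x C
    num_channels = len(descriptors[0])

    # For each channel index, compete across scales with a softmax.
    weights = [[0.0] * num_channels for _ in branches]
    for c in range(num_channels):
        across_scales = softmax([desc[c] for desc in descriptors])
        for s, w in enumerate(across_scales):
            weights[s][c] = w

    # Rescale every channel of every branch, then concatenate the branches.
    output = []
    for s, branch in enumerate(branches):
        for c, channel in enumerate(branch):
            output.append([[weights[s][c] * v for v in row] for row in channel])
    return output, weights


# Usage: two scales, one channel each, on a 2 x 2 grid.
example_branches = [
    [[[1.0, 1.0], [1.0, 1.0]]],
    [[[3.0, 3.0], [3.0, 3.0]]],
]
example_output, example_weights = pyramid_squeeze_attention(example_branches)
```

In this toy setting the branch with the larger average activation receives the larger weight, and the weights for each channel index sum to 1 across scales. A real module would operate on learned convolutional features rather than raw inputs.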

updated: Thu Jul 22 2021 13:08:44 GMT+0000 (UTC)

published: Sun May 30 2021 07:26:41 GMT+0000 (UTC)

arXiv
