Residual Attention: A Simple but Effective Method for Multi-Label Recognition

Ke Zhu; Jianxin Wu

Multi-label image recognition is a challenging computer vision task of practical use. Progress in this area, however, is often characterized by complicated methods, heavy computation, and a lack of intuitive explanations. To effectively capture the different spatial regions occupied by objects from different categories, we propose an embarrassingly simple module named class-specific residual attention (CSRA). CSRA generates class-specific features for every category by computing a simple spatial attention score, and then combines them with the class-agnostic average pooling feature. CSRA achieves state-of-the-art results on multi-label recognition while being much simpler than existing methods. Furthermore, with only 4 lines of code, CSRA also leads to consistent improvements across many diverse pretrained models and datasets without any extra training. CSRA is easy to implement and light in computation, and it also lends itself to intuitive explanations and visualizations.
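The combination described in the abstract can be illustrated with a minimal NumPy sketch. It is one plausible reading of the abstract, not the authors' released code: the function name `residual_attention_logits`, the weighting factor `lam`, the softmax `temperature`, and the array shapes are all assumptions introduced here for illustration. The sketch scores every spatial location with a per-class linear classifier, takes the class-agnostic average over locations, and adds a class-specific attention-weighted term as a residual.

```python
import numpy as np


def residual_attention_logits(features, class_weights, lam=0.2, temperature=1.0):
    """Hypothetical sketch of a class-specific residual attention head.

    features:      (d, n) array, one d-dimensional feature per spatial location
    class_weights: (C, d) array, one linear classifier per class
    lam:           assumed weight of the class-specific (attention) term
    temperature:   assumed sharpness of the spatial softmax
    Returns a (C,) array of class logits.
    """
    # Per-class score at every spatial location: shape (C, n).
    scores = class_weights @ features

    # Class-agnostic part: average pooling over locations.
    avg_logits = scores.mean(axis=1)

    # Class-specific part: softmax over locations, computed per class.
    scaled = temperature * scores
    scaled = scaled - scaled.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scaled)
    attn = attn / attn.sum(axis=1, keepdims=True)
    attn_logits = (attn * scores).sum(axis=1)

    # Residual combination of the two terms.
    return avg_logits + lam * attn_logits


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 49))    # e.g. a 7x7 feature map with 8 channels
    weights = rng.normal(size=(5, 8))   # 5 classes
    print(residual_attention_logits(feats, weights).shape)
```

Under these assumptions, setting `lam=0` recovers plain average pooling, and a very large `temperature` makes the attention term approach per-class max pooling over locations, so the head interpolates between the two without adding any learned parameters.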

updated: Thu Aug 05 2021 08:45:57 GMT+0000 (UTC)

published: Thu Aug 05 2021 08:45:57 GMT+0000 (UTC)

arXiv
