Tucker Bilinear Attention Network for Multi-scale Remote Sensing Object Detection

Tao Chen; Ruirui Li; Jiafeng Fu; Daguang Jiang

Object detection in very-high-resolution (VHR) remote sensing images plays a vital role in applications such as urban planning, land resource management, and rescue missions. The large variation in the scale of remote sensing targets is one of the main challenges in VHR remote sensing object detection. Existing methods improve detection accuracy for high-resolution remote sensing objects by refining the structure of feature pyramids and adopting different attention modules. However, small targets still suffer from serious missed detections due to the loss of key detail features, and there is still room for improvement in how multi-scale features are fused and balanced. To address this issue, this paper proposes two novel modules, Guided Attention and Tucker Bilinear Attention, which are applied at the early-fusion and late-fusion stages, respectively. The former effectively retains clean key detail features, and the latter better balances features through semantic-level correlation mining. Based on these two modules, we build a new multi-scale remote sensing object detection framework. Without bells and whistles, the proposed method substantially improves the average precision of small objects and achieves the highest mean average precision compared with 9 state-of-the-art methods on DOTA, DIOR, and NWPU VHR-10. Code and models are available at https://github.com/Shinichict/GTNet.

updated: Sun May 28 2023 06:39:19 GMT+0000 (UTC)

published: Thu Mar 09 2023 15:20:03 GMT+0000 (UTC)

arXiv
