Align Deep Features for Oriented Object Detection

Jiaming Han; Jian Ding; Jie Li; Gui-Song Xia

The past decade has witnessed significant progress in detecting objects in aerial images, which often appear at widely varying scales and in arbitrary orientations. However, most existing methods rely on heuristically defined anchors with different scales, angles, and aspect ratios, and usually suffer from severe misalignment between anchor boxes and axis-aligned convolutional features, which leads to a common inconsistency between the classification score and localization accuracy. To address this issue, we propose a Single-shot Alignment Network (S^2A-Net) consisting of two modules: a Feature Alignment Module (FAM) and an Oriented Detection Module (ODM). The FAM generates high-quality anchors with an Anchor Refinement Network and adaptively aligns the convolutional features to the anchor boxes with a novel Alignment Convolution. The ODM first adopts active rotating filters to encode orientation information and then produces orientation-sensitive and orientation-invariant features to alleviate the inconsistency between classification score and localization accuracy. We also explore an approach to detecting objects in large images, which leads to a better trade-off between speed and accuracy. Extensive experiments demonstrate that our method achieves state-of-the-art performance on two commonly used aerial object datasets (DOTA and HRSC2016) while maintaining high efficiency. The code is available at https://github.com/csuhan/s2anet.
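To make the idea of aligning axis-aligned features to an oriented anchor more concrete, the sketch below spreads a k x k grid of sampling points over a rotated box instead of over an upright square. It is only an illustration of the general idea described in the abstract, not the authors' Alignment Convolution: the function name `oriented_sampling_grid`, its parameters, and the box parameterisation (centre, size, angle in radians) are all assumptions made for this example.

```python
import math


def oriented_sampling_grid(cx, cy, w, h, theta, k=3):
    """Return k*k (x, y) sampling points spread over an oriented box.

    Hypothetical helper for illustration only. (cx, cy) is the box centre,
    (w, h) its width and height, and theta its rotation in radians.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half = (k - 1) / 2
    points = []
    for row in range(k):
        for col in range(k):
            # Regular grid position in the box's own frame, scaled to box size.
            dx = (col - half) * w / k
            dy = (row - half) * h / k
            # Rotate by theta, then translate to the box centre.
            x = cx + dx * cos_t - dy * sin_t
            y = cy + dx * sin_t + dy * cos_t
            points.append((x, y))
    return points


if __name__ == "__main__":
    # An upright box and the same box rotated by 90 degrees.
    for angle in (0.0, math.pi / 2):
        grid = oriented_sampling_grid(10.0, 20.0, 6.0, 3.0, angle)
        print([(round(x, 2), round(y, 2)) for x, y in grid])
```

With `theta = 0` the points form an ordinary axis-aligned grid stretched to the box size; with a non-zero angle the same grid rotates with the box, so the sampled locations follow the anchor rather than the image axes. Turning these locations into features (for example via offsets to an offset-based sampling operator) is left out of the sketch.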

updated: Mon Jul 12 2021 03:26:49 GMT+0000 (UTC)

published: Fri Aug 21 2020 09:55:13 GMT+0000 (UTC)

arXiv
