Learning Calibrated-Guidance for Object Detection in Aerial Images

Zongqi Wei; Dong Liang; Dong Zhang; Qixiang Geng; Liyan Zhang; Han Sun; Huiyu Zhou; Mingqiang Wei; Pan Gao

Object detection is one of the most fundamental yet challenging research topics in computer vision. Recently, research on this topic in aerial images has made tremendous progress. However, complex backgrounds and poorer imaging quality remain obvious problems in aerial object detection. Most state-of-the-art approaches develop elaborate attention mechanisms for space-time feature calibration at a high computational cost, while surprisingly ignoring the importance of channel-wise feature calibration. In this work, we propose a simple yet effective Calibrated-Guidance (CG) scheme that enhances channel communication in a feature-transformer fashion and adaptively determines the calibration weight for each channel based on global feature affinity correlations. Specifically, for a given set of feature maps, CG first computes the feature similarity between each channel and the remaining channels as the intermediary calibration guidance. It then re-represents each channel by aggregating all channels, weighted via the guidance operation. CG is a general module that can be plugged into any deep neural network; we name the resulting network CG-Net. To demonstrate its effectiveness and efficiency, we carry out extensive experiments on both oriented and horizontal object detection tasks in aerial images. Experimental results on two challenging benchmarks (DOTA and HRSC2016) demonstrate that CG-Net achieves new state-of-the-art accuracy with a fair computational overhead. The source code is available at https://github.com/WeiZongqi/CG-Net.
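The two steps described in the abstract (channel-to-channel similarity as guidance, then re-representing each channel as a guidance-weighted aggregate) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `channel_guidance`, the dot-product similarity, the softmax normalization, and the inclusion of each channel's similarity with itself are all assumptions made for illustration; the released CG-Net code should be consulted for the actual design.

```python
import numpy as np


def channel_guidance(features: np.ndarray) -> np.ndarray:
    """Recalibrate channels of a (C, H, W) feature map (illustrative sketch).

    Assumptions (not taken from the paper): dot-product similarity,
    row-wise softmax, and self-similarity kept in the affinity matrix.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)  # one row per channel

    # Step 1: similarity between every pair of channels -> (C, C) guidance.
    affinity = flat @ flat.T
    affinity = affinity - affinity.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(affinity)
    weights /= weights.sum(axis=1, keepdims=True)  # each row sums to 1

    # Step 2: re-represent each channel as a weighted sum of all channels.
    recalibrated = weights @ flat
    return recalibrated.reshape(c, h, w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(4, 3, 5))
    print(channel_guidance(x).shape)
```

Because the guidance matrix is only C × C, its size depends on the number of channels rather than on the spatial resolution, which is one plausible reading of why a channel-wise scheme can be cheaper than spatial attention.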

updated: Tue Dec 14 2021 06:21:12 GMT+0000 (UTC)

published: Sun Mar 21 2021 13:55:46 GMT+0000 (UTC)

arXiv
