Rotation Equivariant Feature Image Pyramid Network for Object Detection in Optical Remote Sensing Imagery

Pourya Shamsolmoali; Masoumeh Zareapoor; Jocelyn Chanussot; Huiyu Zhou; Jie Yang

Object detection is essential in many aerial vision-based applications. Over the last few years, methods based on convolutional neural networks have made substantial progress. However, because objects vary widely in scale, density, and orientation, current detectors struggle to extract semantically strong features for small-scale objects with a predefined convolution kernel. To address this problem, we propose the rotation equivariant feature image pyramid network (REFIPN), an image pyramid network based on rotation-equivariant convolution. The proposed model adopts a single-shot detector in parallel with a lightweight image pyramid module to extract representative features and generate regions of interest in an optimized manner. The proposed network extracts features over a wide range of scales and orientations by using novel convolution filters. These features are used to generate vector fields and to determine the weight and angle of the highest-scoring orientation at every spatial location in an image. With this approach, the performance of small-sized object detection is enhanced without sacrificing the performance of large-sized object detection. The proposed model is validated on two commonly used aerial benchmarks, and the results show that it can achieve state-of-the-art performance with satisfactory efficiency.
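The idea of keeping, at each spatial location, the weight and angle of the highest-scoring orientation can be illustrated with a minimal NumPy sketch. This is not the authors' REFIPN filter design: it assumes a single 2-D filter rotated only by exact multiples of 90 degrees, and the function names (`rotated_bank`, `correlate2d_valid`, `orientation_vector_field`) are hypothetical, chosen for illustration.

```python
import numpy as np


def rotated_bank(kernel):
    """Return the kernel rotated by 0, 90, 180 and 270 degrees.

    Assumption: only exact 90-degree rotations are used, so no
    interpolation is needed. A real model would use finer angles.
    """
    return [np.rot90(kernel, k) for k in range(4)]


def correlate2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation implemented with NumPy only."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)


def orientation_vector_field(image, kernel):
    """Per-pixel weight and angle of the highest-scoring orientation.

    Returns (magnitude, angle): the maximum response over the rotated
    copies of the kernel, and the rotation angle (radians) that gave it.
    """
    responses = np.stack(
        [correlate2d_valid(image, k) for k in rotated_bank(kernel)]
    )
    best = responses.argmax(axis=0)       # index of the winning rotation
    magnitude = responses.max(axis=0)     # weight of the winning rotation
    angle = best * (np.pi / 2)            # winning rotation as an angle
    return magnitude, angle


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))

magnitude, angle = orientation_vector_field(image, kernel)
magnitude_rot, _ = orientation_vector_field(np.rot90(image), kernel)

# Rotating the input by 90 degrees rotates the magnitude map the same way.
print(np.allclose(magnitude_rot, np.rot90(magnitude)))
```

In this toy setting, rotating the input rotates the magnitude map and shifts the winning angle by the same rotation, which is the equivariance property the abstract relies on. Collapsing the orientation axis into one magnitude and one angle per location keeps the feature map compact instead of storing one channel per orientation.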

updated: Mon Sep 06 2021 03:09:00 GMT+0000 (UTC)

published: Wed Jun 02 2021 01:33:49 GMT+0000 (UTC)

arXiv
