Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment

Guangxing Han; Shiyuan Huang; Jiawei Ma; Yicheng He; Shih-Fu Chang

Few-shot object detection (FSOD) aims to detect objects using only a few examples. It is critically needed for many practical applications but so far remains challenging. We propose a meta-learning-based few-shot object detection method that transfers meta-knowledge learned from data-abundant base classes to data-scarce novel classes. Our method incorporates a coarse-to-fine approach into the proposal-based object detection framework and integrates prototype-based classifiers into both the proposal generation and classification stages. To improve proposal generation for few-shot novel classes, we propose to learn a lightweight matching network that measures the similarity between each spatial position in the query image feature map and spatially pooled class features, instead of the traditional object/non-object classifier, thus generating category-specific proposals and improving proposal recall for novel classes. To address the spatial misalignment between generated proposals and few-shot class examples, we propose a novel attentive feature alignment method, thus improving the performance of few-shot object detection. Meanwhile, we jointly learn a Faster R-CNN detection head for base classes. Extensive experiments conducted on multiple FSOD benchmarks show that our proposed approach achieves state-of-the-art results under (incremental) few-shot learning settings.
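The category-specific proposal step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract describes a learned lightweight matching network, whereas this sketch substitutes plain cosine similarity as a stand-in, and every function name here (spatial_pool, class_prototype, similarity_map) is hypothetical. Feature maps are assumed to be nested lists of shape H x W x C.

```python
import math

def spatial_pool(feature_map):
    """Average an H x W x C feature map over its spatial positions (C-vector)."""
    cells = [cell for row in feature_map for cell in row]
    channels = len(cells[0])
    return [sum(cell[c] for cell in cells) / len(cells) for c in range(channels)]

def class_prototype(support_maps):
    """Average the spatially pooled features of the K support examples of a class."""
    pooled = [spatial_pool(fm) for fm in support_maps]
    channels = len(pooled[0])
    return [sum(p[c] for p in pooled) / len(pooled) for c in range(channels)]

def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def similarity_map(query_map, prototype):
    """Score every spatial position of the query feature map against the prototype.

    Assumption: cosine similarity replaces the learned matching network.
    """
    return [[cosine(cell, prototype) for cell in row] for row in query_map]

# Toy data: two support examples (1 x 2 x 2 each) and one 2 x 2 x 2 query map.
support = [
    [[[1.0, 0.0], [1.0, 0.0]]],
    [[[1.0, 0.0], [1.0, 0.0]]],
]
query = [
    [[2.0, 0.0], [0.0, 3.0]],
    [[1.0, 1.0], [0.0, 0.0]],
]

prototype = class_prototype(support)
sim = similarity_map(query, prototype)
```

Positions whose features point in the same direction as the class prototype score highest, so thresholding or ranking such a map per class is one way to obtain category-specific proposal locations instead of a generic object/non-object score.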

updated: Thu Apr 15 2021 19:01:27 GMT+0000 (UTC)

published: Thu Apr 15 2021 19:01:27 GMT+0000 (UTC)

arXiv
