AFD-Net: Adaptive Fully-Dual Network for Few-Shot Object Detection

Longyao Liu; Bo Ma; Yulin Zhang; Xin Yi; Haozhi Li

Few-shot object detection (FSOD) aims to learn a detector that can adapt quickly to previously unseen objects from scarce annotated examples, which is a challenging task. Existing methods address this problem by performing the classification and localization subtasks with a shared component of the detector (e.g., the RoI head), yet few of them consider that the two subtasks prefer different feature embeddings. In this paper, we carefully analyze the characteristics of FSOD and argue that a general few-shot detector should explicitly decompose the two subtasks while leveraging information from both to enhance feature representations. To this end, we propose a simple yet effective Adaptive Fully-Dual Network (AFD-Net). Specifically, we extend Faster R-CNN by introducing a Dual Query Encoder and a Dual Attention Generator for separate feature extraction, and a Dual Aggregator for separate model reweighting. As a natural consequence, separate state estimation is achieved by the R-CNN detector. In addition, to obtain enhanced feature representations, we introduce an Adaptive Fusion Mechanism that adaptively performs feature fusion in the different subtasks. Extensive experiments on PASCAL VOC and MS COCO in various settings show that our method achieves new state-of-the-art performance by a large margin, demonstrating its effectiveness and generalization ability.
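The dual-branch idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the layer definitions, so the sketch assumes channel-wise reweighting as the aggregation step and a sigmoid-gated blend as the fusion step. All function names and the gate parameters are hypothetical, and the features are plain Python lists rather than RoI tensors.

```python
import math


def channelwise_reweight(query, attention):
    """Scale each query channel by the matching support attention value.

    Assumption: "model reweighting" is taken here to mean channel-wise
    multiplication; the abstract does not spell out the operator.
    """
    return [q * a for q, a in zip(query, attention)]


def adaptive_fuse(own, other, gate_logit):
    """Blend a branch's own feature with the other branch's feature.

    Assumption: the gate is a single scalar (it would be learned in practice);
    sigmoid(gate_logit) is the share kept from `own`.
    """
    gate = 1.0 / (1.0 + math.exp(-gate_logit))
    return [gate * o + (1.0 - gate) * t for o, t in zip(own, other)]


def dual_branch_features(query_cls, query_loc, attn_cls, attn_loc,
                         gate_cls=2.0, gate_loc=2.0):
    """Return (classification feature, localization feature).

    Each subtask has its own query feature and its own support attention,
    mirroring the separate encoders, attention generators, and aggregators
    named in the abstract. Fusion then lets each branch borrow from the other.
    """
    agg_cls = channelwise_reweight(query_cls, attn_cls)
    agg_loc = channelwise_reweight(query_loc, attn_loc)
    fused_cls = adaptive_fuse(agg_cls, agg_loc, gate_cls)
    fused_loc = adaptive_fuse(agg_loc, agg_cls, gate_loc)
    return fused_cls, fused_loc


if __name__ == "__main__":
    cls_feat, loc_feat = dual_branch_features(
        query_cls=[1.0, 2.0], query_loc=[4.0, 4.0],
        attn_cls=[2.0, 0.5], attn_loc=[0.5, 1.0],
    )
    print(cls_feat, loc_feat)
```

In this sketch, a large positive gate logit keeps each branch almost entirely on its own feature, while a logit of zero averages the two branches. In a real detector the fused classification feature would feed the class scorer and the fused localization feature would feed the box regressor.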

updated: Mon Apr 19 2021 07:29:07 GMT+0000 (UTC)

published: Mon Nov 30 2020 10:21:32 GMT+0000 (UTC)

arXiv
