Cos R-CNN for Online Few-shot Object Detection

Gratianus Wesley Putra Data; Henry Howard-Jenkins; David Murray; Victor Prisacariu


We propose Cos R-CNN, a simple exemplar-based R-CNN formulation designed for online few-shot object detection. That is, it can localise and classify novel object categories in images from only a few examples, without fine-tuning. Cos R-CNN frames detection as a learning-to-compare task: unseen classes are represented as exemplar images, and objects are detected based on their similarity to these exemplars. The cosine-based classification head allows the classification parameters to adapt dynamically to the exemplar embedding, and encourages similar classes to cluster in embedding space without manual tuning of distance-metric hyperparameters. This simple formulation achieves the best results on the recently proposed 5-way ImageNet few-shot detection benchmark, exceeding previous results in the online 1/5/10-shot scenarios by more than 8/3/1%, and performs up to 20% better on novel classes in online 20-way few-shot VOC across all shots.
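To make the learning-to-compare idea concrete, here is a minimal sketch of cosine-similarity classification against exemplar embeddings. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`l2_normalise`, `class_prototypes`, `cosine_scores`), the averaging of k-shot embeddings into one prototype per class, the fixed `scale` value, and the toy 2-D embeddings are all hypothetical. In a real detector the embeddings would come from a backbone applied to region proposals and exemplar images.

```python
import math

def l2_normalise(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def class_prototypes(exemplars):
    """Build one unit-length prototype per class.

    exemplars: dict mapping class name -> list of k embedding vectors.
    Averaging the k shots is an assumption made for this sketch.
    """
    protos = {}
    for name, shots in exemplars.items():
        dim = len(shots[0])
        mean = [sum(s[i] for s in shots) / len(shots) for i in range(dim)]
        protos[name] = l2_normalise(mean)
    return protos

def cosine_scores(query, protos, scale=10.0):
    """Softmax over scaled cosine similarities between a query and each prototype.

    `scale` is a hypothetical temperature chosen for illustration only.
    """
    q = l2_normalise(query)
    logits = {n: scale * sum(a * b for a, b in zip(q, p)) for n, p in protos.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {n: math.exp(l - top) for n, l in logits.items()}
    total = sum(exps.values())
    return {n: e / total for n, e in exps.items()}

# Toy example: two novel classes supplied only as exemplar embeddings.
exemplars = {"cat": [[1.0, 0.1], [0.9, 0.0]], "dog": [[0.0, 1.0]]}
protos = class_prototypes(exemplars)
probs = cosine_scores([2.0, 0.2], protos)
best = max(probs, key=probs.get)
```

Because the class "weights" are just the exemplar embeddings, a new category can be added at inference time by inserting another entry into `exemplars`, with no gradient update. The score also depends only on the direction of the query embedding, not its magnitude.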

updated: Tue Jul 25 2023 13:22:24 GMT+0000 (UTC)

published: Tue Jul 25 2023 13:22:24 GMT+0000 (UTC)

arXiv
