Open-Set Object Detection Using Classification-free Object Proposal and Instance-level Contrastive Learning with Appendix

Zhongxiang Zhou; Yifei Yang; Yue Wang; Rong Xiong

Detecting both known and unknown objects is a fundamental skill for robot manipulation in unstructured environments. Open-set object detection (OSOD) is a promising direction for this problem, which consists of two subtasks: separating objects from background, and open-set object classification. In this paper, we present Openset RCNN to address the challenging OSOD task. To disambiguate unknown objects from background in the first subtask, we propose a classification-free region proposal network (CF-RPN), which estimates the objectness score of each region purely from cues about the object's location and shape, preventing overfitting to the training categories. To identify unknown objects in the second subtask, we propose to represent them by the complementary region of the known categories in a latent space, which is accomplished by a prototype learning network (PLN). PLN performs instance-level contrastive learning to encode proposals into a latent space and builds a compact region centered on a prototype for each known category. Further, we note that detection performance on unknown objects cannot be evaluated without bias when commonly used object detection datasets are not fully annotated. Thus, we introduce a new benchmark by reorganizing GraspNet-1billion, a robotic grasp pose detection dataset with complete annotation. Extensive experiments demonstrate the merits of our method. Finally, we show that Openset RCNN can endow a robot with an open-set perception ability to support robotic rearrangement tasks in cluttered environments. More details can be found at https://sites.google.com/view/openest-rcnn/
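The idea of representing unknown objects by the complementary region of the known categories can be illustrated with a small sketch. This is not the authors' implementation: the Euclidean distance, the fixed `radius` threshold, the two-dimensional embeddings, and the names `classify_open_set` and `prototypes` are all assumptions made for illustration, since the abstract does not specify the metric or decision rule. The sketch only shows the general pattern: a proposal embedding is assigned to a known category if it falls inside the compact region around that category's prototype, and is labeled unknown otherwise.

```python
import math

UNKNOWN = "unknown"  # label for anything outside every known-category region


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_open_set(embedding, prototypes, radius):
    """Return the nearest known category, or UNKNOWN if no prototype
    lies within `radius` of the embedding (hypothetical decision rule)."""
    best_label, best_dist = UNKNOWN, float("inf")
    for label, proto in prototypes.items():
        d = euclidean(embedding, proto)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label if best_dist <= radius else UNKNOWN


# Toy prototypes in a 2-D latent space (made-up values for illustration).
prototypes = {"cup": [0.0, 0.0], "box": [4.0, 0.0]}

near_cup = classify_open_set([0.3, 0.4], prototypes, radius=1.0)
far_away = classify_open_set([2.0, 5.0], prototypes, radius=1.0)
print(near_cup, far_away)
```

In this toy setup the first embedding lies within the radius of the "cup" prototype, while the second is far from both prototypes and therefore falls in the complementary region. In a learned system, the embeddings would come from a trained encoder and the prototypes from the training data of each known category.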

updated: Mon Nov 21 2022 15:00:04 GMT+0000 (UTC)

published: Mon Nov 21 2022 15:00:04 GMT+0000 (UTC)

arXiv
