Move to See Better: Towards Self-Supervised Amodal Object Detection

Zhaoyuan Fang; Ayush Jain; Gabriel Sarch; Adam W. Harley; Katerina Fragkiadaki

Humans learn to understand the world better by moving through their environment to obtain more informative viewpoints of a scene. Most methods for 2D visual recognition tasks such as object detection and segmentation treat images of the same scene as independent samples and do not exploit object permanence across views. Generalizing to novel scenes and views therefore requires additional training with large amounts of human annotation. In this paper, we propose a self-supervised framework that improves an object detector in unseen scenarios by moving an agent around a 3D environment and aggregating multi-view RGB-D information. We unproject confident 2D object detections from a pre-trained detector into 3D and perform unsupervised 3D segmentation on the resulting point cloud. The segmented 3D objects are then re-projected to all other views to obtain pseudo-labels for fine-tuning. Experiments on both indoor and outdoor datasets show that (1) our framework produces high-quality 3D segmentations from raw RGB-D data and a pre-trained 2D detector; (2) fine-tuning with self-supervision significantly improves the 2D detector when an unseen RGB image is given as input at test time; and (3) training a 3D detector with self-supervision outperforms a comparable self-supervised method by a large margin.
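
The unproject and re-project steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, a known rigid transform (`R`, `t`) from the source camera frame to the target camera frame, and that the pixels belonging to one object have already been selected (the unsupervised 3D segmentation step is omitted). All function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Pixel = Tuple[float, float]
Point = Tuple[float, float, float]
Box = Tuple[float, float, float, float]


def unproject(pixels: Sequence[Pixel], depths: Sequence[float],
              fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Lift pixels (u, v) with depth z to 3D points in the camera frame."""
    return [((u - cx) * z / fx, (v - cy) * z / fy, z)
            for (u, v), z in zip(pixels, depths)]


def transform(points: Sequence[Point], R: Sequence[Sequence[float]],
              t: Sequence[float]) -> List[Point]:
    """Apply the rigid transform p' = R p + t to every point."""
    return [(R[0][0] * x + R[0][1] * y + R[0][2] * z + t[0],
             R[1][0] * x + R[1][1] * y + R[1][2] * z + t[1],
             R[2][0] * x + R[2][1] * y + R[2][2] * z + t[2])
            for x, y, z in points]


def project(points: Sequence[Point],
            fx: float, fy: float, cx: float, cy: float) -> List[Pixel]:
    """Project camera-frame points to pixels, skipping points behind the camera."""
    return [(fx * x / z + cx, fy * y / z + cy) for x, y, z in points if z > 0]


def reprojected_box(pixels: Sequence[Pixel], depths: Sequence[float],
                    intrinsics: Tuple[float, float, float, float],
                    R: Sequence[Sequence[float]],
                    t: Sequence[float]) -> Optional[Box]:
    """Pseudo-label box (u_min, v_min, u_max, v_max) in the target view.

    Returns None if no object point lies in front of the target camera.
    """
    fx, fy, cx, cy = intrinsics
    points_src = unproject(pixels, depths, fx, fy, cx, cy)
    points_tgt = transform(points_src, R, t)
    projected = project(points_tgt, fx, fy, cx, cy)
    if not projected:
        return None
    us = [u for u, _ in projected]
    vs = [v for _, v in projected]
    return (min(us), min(vs), max(us), max(vs))


if __name__ == "__main__":
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    object_pixels = [(100.0, 120.0), (140.0, 160.0)]
    object_depths = [2.0, 2.0]
    # Target camera frame is shifted by 0.2 along x relative to the source frame.
    print(reprojected_box(object_pixels, object_depths,
                          (500.0, 500.0, 320.0, 240.0),
                          identity, [0.2, 0.0, 0.0]))
```

Because the box is computed from the object's 3D points rather than from what is visible in the target image, it can cover parts of the object that are occluded in that view, which is the sense in which such pseudo-labels are amodal. The sketch does not clip the box to the image bounds.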

updated: Mon Nov 30 2020 19:16:51 GMT+0000 (UTC)

published: Mon Nov 30 2020 19:16:51 GMT+0000 (UTC)

arXiv
