HyperDet3D: Learning a Scene-conditioned 3D Object Detector

Yu Zheng; Yueqi Duan; Jiwen Lu; Jie Zhou; Qi Tian

A bathtub in a library, a sink in an office, a bed in a laundry room: the counter-intuitiveness of these pairings suggests that the scene provides important prior knowledge for 3D object detection, which helps eliminate ambiguous detections of similar objects. In this paper, we propose HyperDet3D to explore scene-conditioned prior knowledge for 3D object detection. Existing methods strive for better representations of local elements and their relations without scene-conditioned knowledge, which may cause ambiguity because they rely merely on an understanding of individual points and object candidates. Instead, HyperDet3D simultaneously learns scene-agnostic embeddings and scene-specific knowledge through scene-conditioned hypernetworks. More specifically, HyperDet3D not only explores the sharable abstracts from various 3D scenes, but also adapts the detector to the given scene at test time. We propose a discriminative Multi-head Scene-specific Attention (MSA) module to dynamically control the layer parameters of the detector conditioned on the fusion of scene-conditioned knowledge. HyperDet3D achieves state-of-the-art results on the 3D object detection benchmarks of the ScanNet and SUN RGB-D datasets. Moreover, through cross-dataset evaluation, we show that the acquired scene-conditioned prior knowledge still takes effect when facing 3D scenes with a domain gap.
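To make the idea of a scene-conditioned hypernetwork more concrete, the following is a minimal sketch in plain Python. It is not the authors' architecture: the abstract does not specify how the MSA module is built, so every name here (`scene_embedding`, `generate_layer`, `queries`, `hyper_proj`) and the additive form "shared weight plus scene-specific offset" are illustrative assumptions. It only shows the general pattern in which a layer's parameters are generated from a summary of the input scene rather than being fixed after training.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def scene_embedding(point_feats, queries):
    """Attention-pool per-point features into one vector per head, then concatenate.

    `queries` holds one hypothetical learned query vector per head (an assumption).
    """
    dim = len(point_feats[0])
    scale = math.sqrt(dim)
    pooled = []
    for query in queries:
        scores = [sum(q * f for q, f in zip(query, feat)) / scale for feat in point_feats]
        attn = softmax(scores)
        pooled.extend(sum(a * feat[d] for a, feat in zip(attn, point_feats)) for d in range(dim))
    return pooled


def generate_layer(scene_vec, base_weight, hyper_proj):
    """Return the shared (scene-agnostic) weight plus a scene-specific offset."""
    in_dim = len(base_weight[0])
    delta = matvec(hyper_proj, scene_vec)  # flat offset, length out_dim * in_dim
    return [[w + delta[i * in_dim + j] for j, w in enumerate(row)]
            for i, row in enumerate(base_weight)]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Random matrix standing in for learned parameters or input features."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
feat_dim, heads, out_dim = 4, 2, 3
queries = rand_matrix(heads, feat_dim, rng)
base_weight = rand_matrix(out_dim, feat_dim, rng)
hyper_proj = rand_matrix(out_dim * feat_dim, heads * feat_dim, rng)

# Two toy "scenes" with different numbers of points.
scene_a = rand_matrix(5, feat_dim, rng, scale=1.0)
scene_b = rand_matrix(7, feat_dim, rng, scale=1.0)

weight_a = generate_layer(scene_embedding(scene_a, queries), base_weight, hyper_proj)
weight_b = generate_layer(scene_embedding(scene_b, queries), base_weight, hyper_proj)
```

In this sketch the same `base_weight` and `hyper_proj` are shared across scenes, while `weight_a` and `weight_b` differ because each is conditioned on its own scene summary. A detector layer would then apply the generated weight to an object-candidate feature, for example `matvec(weight_a, candidate_feature)`.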

updated: Tue Apr 12 2022 07:57:58 GMT+0000 (UTC)

published: Tue Apr 12 2022 07:57:58 GMT+0000 (UTC)

arXiv
