ODAM: Object Detection, Association, and Mapping using Posed RGB Video

Kejie Li; Daniel DeTone; Steven Chen; Minh Vo; Ian Reid; Hamid Rezatofighi; Chris Sweeney; Julian Straub; Richard Newcombe

Localizing objects and estimating their extent in 3D is an important step towards high-level 3D scene understanding, which has many applications in Augmented Reality and Robotics. We present ODAM, a system for 3D Object Detection, Association, and Mapping using posed RGB videos. The proposed system relies on a deep learning front-end to detect 3D objects from a given RGB frame and associate them to a global object-based map using a graph neural network (GNN). Based on these frame-to-model associations, our back-end optimizes object bounding volumes, represented as super-quadrics, under multi-view geometry constraints and the object scale prior. We validate the proposed system on ScanNet where we show a significant improvement over existing RGB-only methods.
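The abstract does not say how the super-quadric bounding volumes are parameterized. As a minimal sketch only, the widely used superquadric inside-outside function can be evaluated as below. It assumes an object-centered, axis-aligned frame. The function name `superquadric_inside_outside` and the parameters `scale`, `eps1`, and `eps2` are illustrative and not taken from the paper.

```python
import numpy as np


def superquadric_inside_outside(points, scale, eps1=1.0, eps2=1.0):
    """Evaluate the standard superquadric implicit function F at each point.

    F < 1: point is inside the volume, F == 1: on the surface, F > 1: outside.
    All names here are hypothetical; the paper's parameterization may differ.

    points: (N, 3) array (or a single 3-vector) in the object's own frame.
    scale:  (a1, a2, a3) positive semi-axis lengths along x, y, z.
    eps1, eps2: positive shape exponents; 1.0 gives an ellipsoid, values
                near 0 give a box-like shape.
    """
    points = np.atleast_2d(np.asarray(points, dtype=float))
    a1, a2, a3 = scale
    # Absolute values keep fractional powers real and well defined.
    x = np.abs(points[:, 0]) / a1
    y = np.abs(points[:, 1]) / a2
    z = np.abs(points[:, 2]) / a3
    xy_term = (x ** (2.0 / eps2) + y ** (2.0 / eps2)) ** (eps2 / eps1)
    return xy_term + z ** (2.0 / eps1)


if __name__ == "__main__":
    corner = [[0.9, 0.9, 0.9]]
    # The same point lies outside a unit sphere but inside a box-like shape.
    print(superquadric_inside_outside(corner, (1, 1, 1), 1.0, 1.0))
    print(superquadric_inside_outside(corner, (1, 1, 1), 0.1, 0.1))
```

A single family of shapes covering ellipsoids through box-like volumes is one plausible reason to prefer this representation over a fixed cuboid, though the abstract itself does not state the motivation.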

updated: Mon Aug 23 2021 13:28:10 GMT+0000 (UTC)

published: Mon Aug 23 2021 13:28:10 GMT+0000 (UTC)

arXiv
