PETR: Position Embedding Transformation for Multi-View 3D Object Detection

Yingfei Liu; Tiancai Wang; Xiangyu Zhang; Jian Sun

In this paper, we develop a position embedding transformation (PETR) for multi-view 3D object detection. PETR encodes the position information of 3D coordinates into image features, producing 3D position-aware features. Object queries can then perceive these 3D position-aware features and perform end-to-end object detection. PETR achieves state-of-the-art performance (50.4% NDS and 44.1% mAP) on the standard nuScenes dataset and ranks 1st on the benchmark. It can serve as a simple yet strong baseline for future research. Code is available at https://github.com/megvii-research/PETR.
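The abstract states only that 3D coordinate information is encoded into the image features; it does not give the mechanics. The NumPy sketch below illustrates one way such an encoding could look, under assumptions that are not taken from the text above: each feature-map cell is lifted to several candidate depths, the resulting frustum points are mapped to 3D with a 4x4 image-to-world matrix, the coordinates are normalized to a region of interest, and a small two-layer MLP turns them into an embedding that is added to the 2D features. All names (`frustum_points`, `position_aware_features`, `img2world`, `roi`, `w1`, `w2`) are hypothetical and are not the API of the released code.

```python
import numpy as np


def frustum_points(feat_h, feat_w, img_h, img_w, depths):
    """Homogeneous frustum points (u*d, v*d, d, 1), shape (D, H, W, 4).

    Assumption: one point per feature-map cell center and per candidate depth.
    """
    us = (np.arange(feat_w) + 0.5) * img_w / feat_w  # pixel-space cell centers (x)
    vs = (np.arange(feat_h) + 0.5) * img_h / feat_h  # pixel-space cell centers (y)
    d, v, u = np.meshgrid(depths, vs, us, indexing="ij")  # each (D, H, W)
    return np.stack([u * d, v * d, d, np.ones_like(d)], axis=-1)


def position_aware_features(feats, img2world, img_hw, depths, roi, w1, w2):
    """Add an embedding of 3D coordinates to 2D image features.

    feats:     (H, W, C) image features of one camera view
    img2world: (4, 4) matrix mapping (u*d, v*d, d, 1) to world (x, y, z, 1)
    roi:       (6,) array [x_min, y_min, z_min, x_max, y_max, z_max]
    w1, w2:    weights of a hypothetical two-layer MLP, (3*D, hidden) and (hidden, C)
    """
    h, w, _ = feats.shape
    pts = frustum_points(h, w, img_hw[0], img_hw[1], depths)  # (D, H, W, 4)
    xyz = (pts @ img2world.T)[..., :3]                        # (D, H, W, 3)
    xyz = (xyz - roi[:3]) / (roi[3:] - roi[:3])               # normalize to the ROI
    coords = xyz.transpose(1, 2, 0, 3).reshape(h, w, -1)      # (H, W, 3*D)
    hidden = np.maximum(coords @ w1, 0.0)                     # ReLU
    pos_embed = hidden @ w2                                   # (H, W, C)
    return feats + pos_embed                                  # position-aware features
```

In a multi-view setting, the same function would be applied per camera with that camera's own `img2world`, so that features from all views carry coordinates in one shared 3D space. Whether the embedding is added to the features or combined with them in some other way is a design choice this sketch simply assumes.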

updated: Tue Jul 19 2022 08:30:57 GMT+0000 (UTC)

published: Thu Mar 10 2022 20:33:28 GMT+0000 (UTC)

arXiv
