FrustumFormer: Adaptive Instance-aware Resampling for Multi-view 3D Detection

Yuqi Wang; Yuntao Chen; Zhaoxiang Zhang

Transforming features from 2D perspective space to 3D space is essential to multi-view 3D object detection. Recent approaches mainly focus on the design of the view transformation, either lifting perspective-view features into 3D space pixel by pixel using estimated depth, or constructing bird's-eye-view (BEV) features grid by grid via 3D projection. Both treat all pixels or grid cells equally. Choosing what to transform is also important, yet it has rarely been discussed: the pixels of a moving car are more informative than the pixels of the sky. To fully utilize the information contained in images, the view transformation should adapt to different image regions according to their contents. In this paper, we propose a novel framework named FrustumFormer, which pays more attention to the features in instance regions via adaptive instance-aware resampling. Specifically, the model obtains instance frustums on the bird's-eye view by leveraging image-view object proposals. An adaptive occupancy mask within the instance frustum is learned to refine the instance location. Moreover, temporal frustum intersection further reduces the localization uncertainty of objects. Comprehensive experiments on the nuScenes dataset demonstrate the effectiveness of FrustumFormer, which achieves new state-of-the-art performance on the benchmark. Code will be released soon.
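To make the idea of an instance frustum on the bird's-eye view concrete, here is a minimal geometric sketch. It is not the authors' implementation: it assumes a single pinhole camera with hypothetical intrinsics `fx` and `cx`, a BEV grid expressed directly in camera coordinates (x to the right, z forward), and a 2D proposal described only by its horizontal pixel extent `[u_min, u_max]`. The function name `frustum_bev_mask` and all parameters are illustrative. The learned occupancy mask from the abstract is not modeled; the temporal intersection is approximated by a plain logical AND of two masks that are assumed to be already aligned in the same BEV frame.

```python
import numpy as np


def frustum_bev_mask(u_min, u_max, fx, cx, x_centers, z_centers):
    """Mark BEV cells whose centers project between u_min and u_max.

    Assumption: cells are given in camera coordinates (x right, z forward),
    so a cell at (x, z) projects to pixel column u = fx * x / z + cx.
    Returns a boolean array of shape (len(z_centers), len(x_centers)).
    """
    x, z = np.meshgrid(x_centers, z_centers, indexing="xy")
    in_front = z > 0                       # cells behind the camera are never visible
    z_safe = np.where(in_front, z, 1.0)    # avoid dividing by zero or negative depth
    u = fx * x / z_safe + cx               # projected pixel column of each cell center
    return in_front & (u >= u_min) & (u <= u_max)


# Hypothetical 3 x 2 grid: lateral positions in meters, one row behind the camera.
x_centers = np.array([-2.0, 0.0, 2.0])
z_centers = np.array([-1.0, 10.0])

# Frustum of the same object seen in two frames (boxes are made-up values).
mask_now = frustum_bev_mask(40.0, 60.0, fx=100.0, cx=50.0,
                            x_centers=x_centers, z_centers=z_centers)
mask_prev = frustum_bev_mask(40.0, 80.0, fx=100.0, cx=50.0,
                             x_centers=x_centers, z_centers=z_centers)

# Intersecting the two frustums keeps only cells consistent with both views.
intersection = mask_now & mask_prev
```

In a real multi-camera setup the grid would live in the ego-vehicle frame, so each camera's extrinsics and the ego motion between frames would have to be applied before projecting or intersecting; those steps are omitted here.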

updated: Tue Jan 10 2023 17:51:55 GMT+0000 (UTC)

published: Tue Jan 10 2023 17:51:55 GMT+0000 (UTC)

arXiv
