AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection

Zehui Chen; Zhenyu Li; Shiquan Zhang; Liangji Fang; Qinghong Jiang; Feng Zhao; Bolei Zhou; Hang Zhao

Object detection from either RGB images or LiDAR point clouds has been extensively explored in autonomous driving. However, it remains challenging to make these two data sources complementary and beneficial to each other. In this paper, we propose AutoAlign, an automatic feature fusion strategy for 3D object detection. Instead of establishing a deterministic correspondence with the camera projection matrix, we model the mapping relationship between the image and point clouds with a learnable alignment map. This map enables our model to automate the alignment of non-homogeneous features in a dynamic and data-driven manner. Specifically, a cross-attention feature alignment module is devised to adaptively aggregate pixel-level image features for each voxel. To enhance semantic consistency during feature alignment, we also design a self-supervised cross-modal feature interaction module, through which the model can learn feature aggregation with instance-level feature guidance. Extensive experimental results show that our approach leads to improvements of 2.3 mAP and 7.0 mAP on the KITTI and nuScenes datasets, respectively. Notably, our best model reaches 70.9 NDS on the nuScenes testing leaderboard, achieving competitive performance among various state-of-the-art methods.
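The abstract describes aggregating pixel-level image features for each voxel through cross-attention, with a learnable alignment map in place of a fixed camera projection. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes plain scaled dot-product attention with voxel features as queries and pixel features as keys and values; the function name `cross_attention_align`, the projection matrices `w_q`, `w_k`, `w_v`, and all tensor sizes are hypothetical, and the random weights stand in for parameters that would be learned.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention_align(voxel_feats, pixel_feats, w_q, w_k, w_v):
    """Aggregate pixel features for each voxel (illustrative sketch only).

    voxel_feats: (N, C_vox) features of N non-empty voxels (queries)
    pixel_feats: (P, C_img) features of P image pixels (keys and values)
    w_q, w_k, w_v: projection matrices; random here, learned in practice
    """
    q = voxel_feats @ w_q                      # (N, d)
    k = pixel_feats @ w_k                      # (P, d)
    v = pixel_feats @ w_v                      # (P, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (N, P) voxel-to-pixel affinities
    align_map = softmax(scores, axis=-1)       # each row is a distribution over pixels
    aggregated = align_map @ v                 # (N, d) image feature per voxel
    return aggregated, align_map


# Toy sizes chosen only for illustration.
rng = np.random.default_rng(0)
N, P, C_VOX, C_IMG, D = 4, 6, 8, 16, 8
voxel_feats = rng.standard_normal((N, C_VOX))
pixel_feats = rng.standard_normal((P, C_IMG))
w_q = rng.standard_normal((C_VOX, D))
w_k = rng.standard_normal((C_IMG, D))
w_v = rng.standard_normal((C_IMG, D))

aggregated, align_map = cross_attention_align(voxel_feats, pixel_feats, w_q, w_k, w_v)
print(aggregated.shape, align_map.shape)
```

In this sketch each row of `align_map` plays the role of a soft, data-dependent correspondence between one voxel and all pixels, whereas a deterministic projection would assign each voxel to a single pixel location computed from the camera matrix.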

updated: Mon Jan 17 2022 16:08:57 GMT+0000 (UTC)

published: Mon Jan 17 2022 16:08:57 GMT+0000 (UTC)

arXiv
