CrossFusion: Interleaving Cross-modal Complementation for Noise-resistant 3D Object Detection

Yang Yang; Weijie Ma; Hao Chen; Linlin Ou; Xinyi Yu

Recent studies have shown that combining LiDAR and camera modalities is necessary and typical for 3D object detection. Existing fusion strategies tend to rely too heavily on the LiDAR modality and so underuse the abundant semantics available from the camera sensor. Moreover, when LiDAR features are corrupted, the resulting large domain gap prevents existing methods from relying on information from the other modality. To address this, we propose CrossFusion, a more robust and noise-resistant scheme that makes full use of camera and LiDAR features through a designed cross-modal complementation strategy. Extensive experiments show that our method not only outperforms state-of-the-art methods in the setting without an extra depth estimation network, but also demonstrates noise resistance without re-training for specific malfunction scenarios, improving mAP by 5.2% and NDS by 2.4%.

updated: Wed Apr 19 2023 14:35:16 GMT+0000 (UTC)

published: Wed Apr 19 2023 14:35:16 GMT+0000 (UTC)

arXiv
