Multi-Task Cross-Modality Attention-Fusion for 2D Object Detection

Huawei Sun; Hao Feng; Georg Stettinger; Lorenzo Servadei; Robert Wille

Accurate and robust object detection is critical for autonomous driving. Image-based detectors struggle with the low visibility caused by adverse weather. Radar-camera fusion is therefore of particular interest, but optimally fusing heterogeneous data sources remains challenging. To address this, we propose two new radar preprocessing techniques that better align radar and camera data. In addition, we introduce a Multi-Task Cross-Modality Attention-Fusion Network (MCAF-Net) for object detection, which includes two new fusion blocks that exploit information from the feature maps more comprehensively. The proposed algorithm jointly detects objects and segments free space, which guides the model to focus on the more relevant part of the scene, namely the occupied space. Our approach outperforms current state-of-the-art radar-camera fusion-based object detectors on the nuScenes dataset and achieves more robust results in adverse weather and nighttime scenarios.
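The abstract does not describe the internals of the fusion blocks or the training objective, so the following is only a generic illustration of the two ideas it names, not MCAF-Net itself. It sketches (1) an attention-style gate in which a radar feature map modulates a camera feature map at each location, and (2) a multi-task loss that adds a weighted free-space segmentation term to a detection term. The function names, the sigmoid gate, and the `seg_weight` parameter are all assumptions made for this example.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(camera, radar):
    """Hypothetical attention-style gate (not the paper's fusion block).

    Each camera feature is scaled by sigmoid(radar feature) at the same
    location, so strong radar responses let more of the camera signal through.
    Both inputs are 2D lists of floats with identical shapes.
    """
    if len(camera) != len(radar):
        raise ValueError("feature maps must have the same number of rows")
    fused = []
    for cam_row, rad_row in zip(camera, radar):
        if len(cam_row) != len(rad_row):
            raise ValueError("feature maps must have the same number of columns")
        fused.append([c * sigmoid(r) for c, r in zip(cam_row, rad_row)])
    return fused


def multitask_loss(det_loss: float, seg_loss: float, seg_weight: float = 0.5) -> float:
    """Hypothetical joint objective: detection loss plus weighted segmentation loss."""
    return det_loss + seg_weight * seg_loss


if __name__ == "__main__":
    camera = [[2.0, 4.0], [6.0, 8.0]]
    radar = [[0.0, 0.0], [0.0, 0.0]]  # zero radar response gives a gate of 0.5
    print(gated_fusion(camera, radar))
    print(multitask_loss(det_loss=1.0, seg_loss=2.0, seg_weight=0.5))
```

In a real network the gate would be computed by learned layers over multi-channel tensors; the scalar version here only shows the direction of information flow between modalities.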

updated: Mon Jul 17 2023 09:26:13 GMT+0000 (UTC)

published: Mon Jul 17 2023 09:26:13 GMT+0000 (UTC)

arXiv
