Keypoints-Based Deep Feature Fusion for Cooperative Vehicle Detection of Autonomous Driving

Yunshuang Yuan; Hao Cheng; Monika Sester

Sharing collective perception messages (CPMs) between vehicles has been investigated as a way to reduce occlusions and thereby improve the perception accuracy and safety of autonomous driving. However, combining highly accurate data sharing with low communication overhead is a major challenge for collective perception, especially when real-time communication is required among connected and automated vehicles. In this paper, we propose an efficient and effective keypoints-based deep feature fusion framework for collective perception, called Fusion PV-RCNN (FPV-RCNN for short), which is built on the 3D object detector PV-RCNN. We introduce a high-performance bounding box proposal matching module and a keypoints selection strategy to compress the CPM size and solve the multi-vehicle data fusion problem. We also propose an effective localization error correction module based on the maximum consensus principle to increase the robustness of the data fusion. Compared to a bird's-eye view (BEV) keypoints feature fusion, FPV-RCNN improves detection accuracy by about 9% at a strict evaluation criterion (IoU 0.7) on COMAP, a synthetic dataset dedicated to collective perception. Its performance is also comparable to two raw data fusion baselines that have no data loss in sharing. Moreover, our method reduces the CPM size to less than 0.3 KB, which is about 50 times smaller than the BEV feature map sharing used in previous works. Even when the number of CPM feature channels is further reduced from 128 to 32, the detection performance shows no apparent drop. The code of our method is available at https://github.com/YuanYunshuang/FPV_RCNN.
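
The maximum consensus principle mentioned in the abstract can be illustrated with a small, generic sketch. This is not the authors' implementation: it assumes the localization error is a pure 2D translation (rotation is ignored), and the function name `max_consensus_offset`, the tolerance `tol`, and the sample points are hypothetical. Every pair of ego and cooperative points proposes a candidate shift, and the shift that brings the most cooperative points close to an ego point is kept.

```python
import math
from itertools import product


def max_consensus_offset(ego_pts, coop_pts, tol=0.5):
    """Return ((dx, dy), inlier_count) for the best translation-only alignment.

    Illustrative assumption: the pose error between the two vehicles is a
    pure 2D translation, so rotation is not estimated here.
    """
    best_offset, best_inliers = (0.0, 0.0), -1
    # Each (ego, coop) pair proposes one candidate translation.
    for (ex, ey), (cx, cy) in product(ego_pts, coop_pts):
        dx, dy = ex - cx, ey - cy
        # Count cooperative points that land within tol of some ego point
        # after applying the candidate shift.
        inliers = sum(
            1
            for (px, py) in coop_pts
            if any(
                math.hypot(px + dx - qx, py + dy - qy) <= tol
                for (qx, qy) in ego_pts
            )
        )
        # Keep the candidate supported by the largest consensus set.
        if inliers > best_inliers:
            best_offset, best_inliers = (dx, dy), inliers
    return best_offset, best_inliers


# Hypothetical data: the cooperative vehicle sees the same four landmarks,
# displaced by its localization error, plus one spurious point.
ego = [(0.0, 0.0), (10.0, 0.0), (0.0, 5.0), (7.0, 7.0)]
coop = [(-1.5, 0.8), (8.5, 0.8), (-1.5, 5.8), (5.5, 7.8), (50.0, 50.0)]
offset, support = max_consensus_offset(ego, coop)
```

The exhaustive search is quadratic in the number of candidate pairs and cubic overall, which is acceptable only for a handful of points; it is meant to show why a consensus vote tolerates the spurious point, not to suggest a practical design.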

updated: Tue Feb 15 2022 09:24:49 GMT+0000 (UTC)

published: Thu Sep 23 2021 19:41:02 GMT+0000 (UTC)

arXiv
