AttDLNet: Attention-based DL Network for 3D LiDAR Place Recognition

Tiago Barros; Luís Garrote; Ricardo Pereira; Cristiano Premebida; Urbano J. Nunes

LiDAR-based place recognition is one of the key components of SLAM and global localization in autonomous vehicles and robotics applications. With the success of deep learning (DL) approaches in learning useful information from 3D LiDAR data, place recognition has also benefited from this modality, leading to higher re-localization and loop-closure detection performance, particularly in environments with significantly changing conditions. Despite the progress in this field, extracting proper and efficient descriptors from 3D LiDAR data that are invariant to changing conditions and orientation remains an unsolved challenge. To address this problem, this work proposes a novel 3D LiDAR-based deep learning network (named AttDLNet) that uses a range-based proxy representation for point clouds and an attention network with stacked attention layers to selectively focus on long-range context and inter-feature relationships. The proposed network is trained and validated on the KITTI dataset, and an ablation study is presented to assess the novel attention network. Results show that adding attention to the network improves performance, leading to efficient loop closures and outperforming an established 3D LiDAR-based place recognition approach. The ablation study indicates that the middle encoder layers have the highest mean performance, while deeper layers are more robust to orientation change. The code is publicly available at https://github.com/Cybonic/AttDLNet
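To illustrate what a range-based proxy representation of a point cloud can look like, the following is a minimal sketch of a spherical projection that maps 3D points onto a 2D range image. It is not taken from the AttDLNet code. The function name `range_projection`, the 64 x 1024 image size, and the vertical field of view of +3 to -25 degrees are assumptions chosen for illustration, and the abstract does not state the parameters the authors use.

```python
import math


def range_projection(points, height=64, width=1024,
                     fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project (x, y, z) points onto a height x width range image.

    All default values are illustrative assumptions, not the paper's settings.
    Empty cells hold 0.0. If several points fall into one cell, the nearest
    one is kept. Points outside the vertical field of view are clamped to
    the top or bottom row.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down

    image = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # a point at the sensor origin has no direction

        yaw = math.atan2(y, x)    # horizontal angle in [-pi, pi]
        pitch = math.asin(z / r)  # vertical angle

        # Normalise both angles to [0, 1] and scale to pixel coordinates.
        u = 0.5 * (1.0 - yaw / math.pi) * width
        v = (1.0 - (pitch - fov_down) / fov) * height

        col = min(width - 1, max(0, int(u)))
        row = min(height - 1, max(0, int(v)))

        if image[row][col] == 0.0 or r < image[row][col]:
            image[row][col] = r
    return image


if __name__ == "__main__":
    cloud = [(10.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 8.0, -1.0)]
    img = range_projection(cloud)
    filled = sum(1 for row in img for value in row if value > 0.0)
    print(f"{filled} of {len(img) * len(img[0])} cells contain a range value")
```

A 2D image of this kind can be processed by a standard convolutional encoder, which is one common reason for choosing a range-based proxy over raw point sets.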

updated: Wed Jan 04 2023 12:21:40 GMT+0000 (UTC)

published: Thu Jun 17 2021 16:34:37 GMT+0000 (UTC)

arXiv
