Efficient 3D Point Cloud Feature Learning for Large-Scale Place Recognition

Le Hui; Mingmei Cheng; Jin Xie; Jian Yang

Point cloud based retrieval for place recognition remains challenging because scenes in changing environments undergo drastic appearance and illumination changes. Existing deep learning based global descriptors for the retrieval task usually consume substantial computational resources (e.g., memory), which makes them ill-suited to hardware with limited resources. In this paper, we develop an efficient point cloud learning network (EPC-Net) that forms a global descriptor for visual place recognition, achieving good performance while reducing memory consumption and inference time. First, we propose a lightweight but effective neural network module, called ProxyConv, to aggregate the local geometric features of point clouds. It leverages the spatial adjacency matrix and proxy points to simplify the original edge convolution and lower memory consumption. Then, we design a lightweight grouped VLAD network (G-VLAD) to form global descriptors for retrieval. In contrast to the original VLAD network, G-VLAD uses a proposed grouped fully connected (GFC) layer that decomposes the high-dimensional vectors into a group of low-dimensional vectors, which reduces the number of network parameters while preserving the discriminability of the feature vectors. Finally, to further reduce inference time, we develop a simplified version of EPC-Net, called EPC-Net-L, which consists of two ProxyConv modules and one max pooling layer to aggregate global descriptors. By distilling knowledge from EPC-Net, EPC-Net-L can obtain discriminative global descriptors for retrieval. Extensive experiments on the Oxford dataset and three in-house datasets demonstrate that the proposed method achieves state-of-the-art performance with fewer parameters, fewer FLOPs, and lower runtime per frame.
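
To illustrate why a grouped fully connected layer can reduce parameters, the following is a minimal NumPy sketch of the general idea. It is not the authors' implementation: the function names, the choice of 8 groups, and the 4096-to-256 dimensions are hypothetical values chosen only for illustration, and the abstract does not state the actual layer sizes. The sketch assumes the input vector is split into equal chunks, each mapped by its own small weight matrix.

```python
import numpy as np


def dense_fc_params(d_in, d_out):
    """Parameter count (weights + biases) of an ordinary fully connected layer."""
    return d_in * d_out + d_out


def grouped_fc_params(d_in, d_out, groups):
    """Parameter count when input and output are split into `groups` equal chunks."""
    assert d_in % groups == 0 and d_out % groups == 0
    return groups * ((d_in // groups) * (d_out // groups) + d_out // groups)


def grouped_fc(x, weights, biases):
    """Apply one small linear map per group and concatenate the results.

    x:       (batch, d_in)
    weights: (groups, d_in // groups, d_out // groups)
    biases:  (groups, d_out // groups)
    """
    batch = x.shape[0]
    groups = weights.shape[0]
    chunks = x.reshape(batch, groups, -1)  # (batch, groups, d_in // groups)
    out = np.einsum("bgi,gio->bgo", chunks, weights) + biases
    return out.reshape(batch, -1)  # (batch, d_out)


if __name__ == "__main__":
    # Hypothetical sizes, chosen only for illustration.
    d_in, d_out, groups = 4096, 256, 8
    print("dense params:  ", dense_fc_params(d_in, d_out))
    print("grouped params:", grouped_fc_params(d_in, d_out, groups))

    rng = np.random.default_rng(0)
    x = rng.standard_normal((2, d_in))
    w = rng.standard_normal((groups, d_in // groups, d_out // groups))
    b = np.zeros((groups, d_out // groups))
    print("output shape:  ", grouped_fc(x, w, b).shape)
```

Under these assumptions, splitting into g groups cuts the weight count by roughly a factor of g, because each group only connects its own input chunk to its own output chunk.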

updated: Thu Jan 07 2021 05:15:31 GMT+0000 (UTC)

published: Thu Jan 07 2021 05:15:31 GMT+0000 (UTC)

arXiv
