TransVPR: Transformer-based place recognition with multi-level attention aggregation

Ruotong Wang; Yanqing Shen; Weiliang Zuo; Sanping Zhou; Nanning Zheng

Visual place recognition is a challenging task for applications such as autonomous driving navigation and mobile robot localization. Distracting elements present in complex scenes often lead to deviations in the perception of visual place. To address this problem, it is crucial to integrate information from only task-relevant regions into image representations. In this paper, we introduce a novel holistic place recognition model, TransVPR, based on vision Transformers. It benefits from the desirable property of the self-attention operation in Transformers, which can naturally aggregate task-relevant features. Attentions from multiple levels of the Transformer, which focus on different regions of interest, are further combined to generate a global image representation. In addition, the output tokens from Transformer layers filtered by the fused attention mask are considered as key-patch descriptors, which are used to perform spatial matching to re-rank the candidates retrieved by the global image features. The whole model allows end-to-end training with a single objective and image-level supervision. TransVPR achieves state-of-the-art performance on several real-world benchmarks while maintaining low computational time and storage requirements.
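
The retrieve-then-re-rank pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; several details are assumptions made only for illustration: the per-level attention masks are fused with an element-wise maximum, key patches are selected by a fixed threshold on the fused mask, the patch tokens come from the last level, and a mutual-nearest-neighbour match count stands in for the spatial matching step. All function names and parameters below are hypothetical.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def fuse_attention(level_masks):
    """Fuse per-level masks (each of shape [num_patches]).

    Element-wise maximum is an assumption, not the paper's stated rule.
    """
    return np.max(np.stack(level_masks, axis=0), axis=0)


def global_descriptor(level_tokens, level_masks):
    """Attention-weighted sum of patch tokens per level, concatenated."""
    parts = [(m[:, None] * t).sum(axis=0)
             for t, m in zip(level_tokens, level_masks)]
    return l2_normalize(np.concatenate(parts))


def key_patches(tokens, fused_mask, threshold=0.5):
    """Keep tokens whose fused attention exceeds a (hypothetical) threshold."""
    return l2_normalize(tokens[fused_mask > threshold])


def mutual_nn_score(query_patches, ref_patches):
    """Count mutual nearest-neighbour matches between two patch sets.

    A simplified stand-in for spatial matching (assumption).
    """
    if len(query_patches) == 0 or len(ref_patches) == 0:
        return 0
    sim = query_patches @ ref_patches.T
    q_to_r = sim.argmax(axis=1)  # best reference patch for each query patch
    r_to_q = sim.argmax(axis=0)  # best query patch for each reference patch
    return int(np.sum(r_to_q[q_to_r] == np.arange(len(query_patches))))


def retrieve_and_rerank(q_global, q_patches, db_globals, db_patches, top_k=3):
    """Retrieve top_k candidates by global similarity, then re-rank them."""
    sims = db_globals @ q_global  # cosine similarity of unit vectors
    candidates = np.argsort(-sims)[:top_k]
    scores = [mutual_nn_score(q_patches, db_patches[i]) for i in candidates]
    # Stable sort keeps the global-retrieval order when match counts tie.
    order = np.argsort(-np.asarray(scores), kind="stable")
    return [int(candidates[j]) for j in order]
```

In this sketch the global descriptor only needs one dot product per database image, so it is used to shortlist candidates cheaply, while the more expensive patch-level comparison is run on the shortlist alone.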

updated: Thu Mar 03 2022 01:37:34 GMT+0000 (UTC)

published: Thu Jan 06 2022 10:20:24 GMT+0000 (UTC)

arXiv
