DistillPose: Lightweight Camera Localization Using Auxiliary Learning

Yehya Abouelnaga; Mai Bui; Slobodan Ilic

We propose a lightweight retrieval-based pipeline to predict 6-DoF camera poses from RGB images. Our pipeline uses a convolutional neural network (CNN) to encode a query image as a feature vector. A nearest neighbor lookup finds the pose-wise nearest database image. A Siamese convolutional neural network regresses the relative pose from the nearest neighboring database image to the query image. The relative pose is then applied to the nearest neighbor's absolute pose to obtain the final absolute pose prediction for the query image. Our model is a distilled version of NN-Net that reduces its parameters by 98.87%, its information retrieval feature vector size by 87.5%, and its inference time by 89.18% without a significant decrease in localization accuracy.
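The retrieval and pose-composition steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `nearest_neighbor`, `compose`, `localize`, and `regress_relative_pose` are hypothetical, the CNN encoder and the Siamese regressor are replaced by precomputed feature vectors and a caller-supplied function, and the pose convention (position plus unit quaternion `(w, x, y, z)`, with the relative pose expressed in the neighbor's frame) is an assumption the abstract does not specify.

```python
import math

def quat_mul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = quat_mul(quat_mul(q, (0.0, *v)), conj)
    return (rx, ry, rz)

def nearest_neighbor(query_feat, database):
    """Return the database entry whose feature vector is closest (Euclidean)."""
    return min(database, key=lambda entry: math.dist(query_feat, entry["feat"]))

def compose(abs_pose, rel_pose):
    """Apply a relative pose to an absolute pose.

    Assumed convention: the relative translation is expressed in the
    neighbor's camera frame, so it is rotated into the world frame first.
    """
    t_nn, q_nn = abs_pose
    dt, dq = rel_pose
    moved = rotate(q_nn, dt)
    t_query = tuple(a + b for a, b in zip(t_nn, moved))
    q_query = quat_mul(q_nn, dq)
    return t_query, q_query

def localize(query_feat, database, regress_relative_pose):
    """Retrieve the nearest database entry, then refine with a relative pose.

    regress_relative_pose is a stand-in for the Siamese network: it takes
    the retrieved entry and the query and returns (translation, quaternion).
    """
    neighbor = nearest_neighbor(query_feat, database)
    rel_pose = regress_relative_pose(neighbor, query_feat)
    return compose(neighbor["pose"], rel_pose)
```

In this sketch each database entry is a dictionary with a `"feat"` vector and a `"pose"` tuple. A real system would store CNN embeddings and might use an approximate nearest-neighbor index instead of a linear scan.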

updated: Mon Aug 09 2021 05:48:24 GMT+0000 (UTC)

published: Mon Aug 09 2021 05:48:24 GMT+0000 (UTC)

arXiv
