HyperPose: Camera Pose Localization using Attention Hypernetworks

Ron Ferens; Yosi Keller

In this study, we propose the use of attention hypernetworks in camera pose localization. The dynamic nature of natural scenes, including changes in environment, perspective, and lighting, creates an inherent domain gap between the training and test sets that limits the accuracy of contemporary localization networks. To overcome this issue, we suggest a camera pose regressor that integrates a hypernetwork. During inference, the hypernetwork generates adaptive weights for the localization regression heads based on the input image, effectively reducing the domain gap. We also suggest the use of a Transformer-Encoder as the hypernetwork, instead of the common multilayer perceptron, to derive an attention hypernetwork. The proposed approach achieves superior results compared to state-of-the-art methods on contemporary datasets. To the best of our knowledge, this is the first instance of using hypernetworks in camera pose regression, as well as using Transformer-Encoders as hypernetworks. We make our code publicly available.
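The core idea in the abstract, a hypernetwork that emits the regression-head weights for each input image at inference time, can be illustrated with a minimal sketch. This is not the authors' implementation: a single linear map stands in for the Transformer-Encoder hypernetwork, the weights are random rather than trained, and the names `HyperPoseHead` and `regress_pose`, the feature size, and the position-plus-unit-quaternion pose output are all assumptions made for illustration.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class HyperPoseHead:
    """Toy regression head whose weights are generated per input (hypothetical name)."""

    def __init__(self, feat_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.feat_dim = feat_dim
        self.out_dim = out_dim
        # Hypernetwork parameters: one linear map from the image feature to the
        # flattened head weights plus bias. ASSUMPTION: this linear map is a
        # stand-in for the Transformer-Encoder hypernetwork named in the abstract.
        n_params = out_dim * feat_dim + out_dim
        self.hyper = [
            [rng.uniform(-0.1, 0.1) for _ in range(feat_dim)]
            for _ in range(n_params)
        ]

    def generate_weights(self, feature):
        """Produce input-dependent weights and bias for the regression head."""
        flat = matvec(self.hyper, feature)
        split = self.out_dim * self.feat_dim
        weight = [
            flat[i * self.feat_dim:(i + 1) * self.feat_dim]
            for i in range(self.out_dim)
        ]
        bias = flat[split:]
        return weight, bias

    def __call__(self, feature):
        # Apply the freshly generated weights to the same feature.
        weight, bias = self.generate_weights(feature)
        return [y + b for y, b in zip(matvec(weight, feature), bias)]


def regress_pose(feature, position_head, orientation_head):
    """Return (position, unit quaternion); the pose format is an assumption."""
    position = position_head(feature)
    quat = orientation_head(feature)
    norm = math.sqrt(sum(q * q for q in quat)) or 1.0
    return position, [q / norm for q in quat]


FEAT_DIM = 8  # illustrative size; a real backbone embedding would be far larger
position_head = HyperPoseHead(FEAT_DIM, 3, seed=1)
orientation_head = HyperPoseHead(FEAT_DIM, 4, seed=2)

feature = [0.5] * FEAT_DIM  # placeholder for an image embedding
position, orientation = regress_pose(feature, position_head, orientation_head)
```

Because the head weights are a function of the input feature, two different images are regressed with two different heads, which is the mechanism the abstract credits with narrowing the train/test domain gap.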

updated: Sun Mar 05 2023 08:45:50 GMT+0000 (UTC)

published: Sun Mar 05 2023 08:45:50 GMT+0000 (UTC)

arXiv
