PVT++: A Simple End-to-End Latency-Aware Visual Tracking Framework

Bowen Li; Ziyuan Huang; Junjie Ye; Yiming Li; Sebastian Scherer; Hang Zhao; Changhong Fu

Visual object tracking is an essential capability of intelligent robots. Most existing approaches have ignored online latency, which can cause severe performance degradation in real-world processing. For unmanned aerial vehicles in particular, where robust tracking is more challenging and onboard computation is limited, latency can be fatal. In this work, we present a simple framework for end-to-end latency-aware tracking, i.e., end-to-end predictive visual tracking (PVT++). PVT++ can turn most leading-edge trackers into predictive trackers by appending an online predictor. Unlike existing solutions, which use model-based approaches, our framework is learnable, so it can take as input not only motion information but also visual cues, or a combination of both. Moreover, since PVT++ is end-to-end optimizable, joint training can further boost its latency-aware tracking performance. Additionally, this work presents an extended latency-aware evaluation benchmark for assessing a tracker of any speed in the online setting. Empirical results on a robotic platform from an aerial perspective show that PVT++ achieves up to 60% performance gain on various trackers and exhibits better robustness than prior model-based solutions, largely mitigating the degradation brought by latency. Code and models will be made public.
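To make the latency problem concrete, the following is a minimal sketch of how an online evaluation might pair each incoming frame with the newest tracker output that is already finished when that frame arrives. It is an illustration under stated assumptions, not the benchmark protocol from this work: the function name `match_latest_results`, the fixed per-frame latency, and the rule that the tracker only starts on a frame that arrives while it is idle are all hypothetical simplifications.

```python
from typing import List, Optional, Tuple


def match_latest_results(frame_times_ms: List[int], latency_ms: int) -> List[Optional[int]]:
    """For each frame, return the index of the frame whose tracking result
    is the newest one finished by that frame's arrival time (None if no
    result is ready yet).

    Assumptions (illustrative only): constant processing latency, and the
    tracker starts on a frame only if it is idle when the frame arrives.
    """
    finished: List[Tuple[int, int]] = []  # (finish_time_ms, source_frame_index), in time order
    busy_until = 0                        # time at which the tracker becomes idle
    matched: List[Optional[int]] = []

    for i, t in enumerate(frame_times_ms):
        # Results whose processing completed no later than this frame's arrival.
        ready = [src for done, src in finished if done <= t]
        matched.append(ready[-1] if ready else None)

        # If the tracker is idle, it begins processing this frame;
        # otherwise the frame is skipped.
        if t >= busy_until:
            busy_until = t + latency_ms
            finished.append((busy_until, i))

    return matched


if __name__ == "__main__":
    # Frames arriving roughly every 33 ms, tracker needing 50 ms per frame.
    print(match_latest_results([0, 33, 66, 99, 132], 50))
```

In this toy setting the output available at a given frame was computed from an older frame, and that gap between the frame being evaluated and the frame that produced the result is what an appended predictor is meant to compensate for.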

updated: Mon Nov 21 2022 16:43:33 GMT+0000 (UTC)

published: Mon Nov 21 2022 16:43:33 GMT+0000 (UTC)

arXiv
