Temporal Aggregation for Adaptive RGBT Tracking

Zhangyong Tang; Tianyang Xu; Xiao-Jun Wu

Visual object tracking with both RGB and thermal infrared (TIR) spectra available, abbreviated as RGBT tracking, is a novel and challenging research topic that is drawing increasing attention. In this paper, we propose an RGBT tracker that takes spatio-temporal clues into account for robust appearance model learning and, at the same time, constructs an adaptive fusion sub-network for cross-modal interactions. Unlike most existing RGBT trackers, which perform object tracking using only spatial information, our method also considers temporal information. Specifically, whereas traditional Siamese trackers obtain only one search image when sampling template-search image pairs, we select an extra search sample adjacent to the original one to predict the temporal transformation, which improves tracking robustness. As for multi-modal tracking, constrained by the limited RGBT datasets, the adaptive fusion sub-network is appended to our method at the decision level to reflect the complementary characteristics of the two modalities. To design a thermal-infrared-assisted RGB tracker, the outputs of the classification head from the TIR modality are taken into consideration before the residual connection from the RGB modality. Extensive experimental results on three challenging datasets, i.e., VOT-RGBT2019, GTOT and RGBT210, verify the effectiveness of our method. Code will be shared at https://github.com/Zhangyong-Tang/TAAT.
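
The two ideas in the abstract, sampling an extra adjacent search frame and adding TIR classification scores onto the RGB branch at the decision level, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sample_triplet` and `fuse_scores`, the `max_gap` parameter, and the scalar sigmoid gate `gate_logit` are all hypothetical stand-ins, and the real adaptive fusion sub-network is presumably learned rather than a fixed scalar weight.

```python
import math
import random


def sample_triplet(num_frames, max_gap=10, rng=random):
    """Pick a template index, a search index, and an adjacent extra search index.

    Hypothetical sampling rule: the abstract only says the extra search
    sample is adjacent to the original one.
    """
    if num_frames < 3:
        raise ValueError("need at least 3 frames")
    template = rng.randrange(num_frames)
    lo = max(0, template - max_gap)
    hi = min(num_frames - 1, template + max_gap)
    search = rng.randint(lo, hi)
    # Use the next frame as the extra sample, or the previous one at the end.
    extra = search + 1 if search + 1 < num_frames else search - 1
    return template, search, extra


def fuse_scores(rgb_scores, tir_scores, gate_logit=0.0):
    """Add gated TIR classification scores onto the RGB scores (residual form).

    Assumption: a single scalar gate stands in for the adaptive fusion
    sub-network described in the abstract.
    """
    weight = 1.0 / (1.0 + math.exp(-gate_logit))  # sigmoid, in (0, 1)
    return [
        [r + weight * t for r, t in zip(rgb_row, tir_row)]
        for rgb_row, tir_row in zip(rgb_scores, tir_scores)
    ]
```

With `gate_logit=0.0` the gate equals 0.5, so each fused score is the RGB score plus half the TIR score; the RGB map passes through unchanged as the residual term, which matches the "TIR-assisted RGB tracker" framing only in spirit.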

updated: Sat Jan 22 2022 02:31:56 GMT+0000 (UTC)

published: Sat Jan 22 2022 02:31:56 GMT+0000 (UTC)

arXiv

