High-Performance Transformer Tracking

Xin Chen; Bin Yan; Jiawen Zhu; Huchuan Lu; Xiang Ruan; Dong Wang

Correlation plays a critical role in the tracking field, especially in the recently popular Siamese-based trackers. The correlation operation is a simple fusion method that considers the similarity between the template and the search region. However, correlation is a local linear matching process that loses semantic information and easily falls into a local optimum, which may be the bottleneck in designing high-accuracy tracking algorithms. In this work, to determine whether a better feature fusion method than correlation exists, we present a novel attention-based feature fusion network inspired by the transformer. This network effectively combines the template and search region features using attention. Specifically, the proposed method includes an ego-context augment module based on self-attention and a cross-feature augment module based on cross-attention. First, we present a transformer tracking method (named TransT) built on a Siamese-like feature extraction backbone, the designed attention-based fusion mechanism, and the classification and regression head. Based on the TransT baseline, we further design a segmentation branch to generate an accurate mask. Finally, we propose a stronger version of TransT, named TransT-M, by extending TransT with a multi-template scheme and an IoU prediction head. Experiments show that our TransT and TransT-M methods achieve promising results on seven popular datasets. Code and models are available at https://github.com/chenxin-dlut/TransT-M.
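The fusion idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes single-head attention with no learned projections, no positional encodings, and no layer normalization, and the function names (`ego_context_augment`, `cross_feature_augment`, `fuse`) and the token and channel sizes are hypothetical choices made only for illustration. It shows self-attention applied within each feature set, followed by cross-attention between the template and the search region.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(query, key, value):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = query.shape[-1]
    weights = softmax(query @ key.T / np.sqrt(d), axis=-1)
    return weights @ value, weights


def ego_context_augment(x):
    """Self-attention with a residual connection (simplified assumption)."""
    out, _ = attention(x, x, x)
    return x + out


def cross_feature_augment(x, other):
    """Cross-attention: tokens of x query the tokens of the other feature set."""
    out, _ = attention(x, other, other)
    return x + out


def fuse(template, search):
    """One simplified fusion step over template and search-region tokens."""
    t = ego_context_augment(template)
    s = ego_context_augment(search)
    # Each branch is then enriched with information from the other branch.
    t_fused = cross_feature_augment(t, s)
    s_fused = cross_feature_augment(s, t)
    return t_fused, s_fused


# Hypothetical sizes: 16 template tokens, 64 search tokens, 32 channels.
rng = np.random.default_rng(0)
template = rng.standard_normal((16, 32))
search = rng.standard_normal((64, 32))
t_fused, s_fused = fuse(template, search)
print(t_fused.shape, s_fused.shape)
```

Unlike a correlation map, which reduces each template and search pair to a similarity score, the attention weights here are normalized over all tokens of the other feature set, so every search token receives a weighted mix of template features rather than a single local response.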

updated: Wed Nov 23 2022 08:45:18 GMT+0000 (UTC)

published: Fri Mar 25 2022 09:33:29 GMT+0000 (UTC)

arXiv
