Beyond Greedy Search: Tracking by Multi-Agent Reinforcement Learning-based Beam Search

Xiao Wang; Zhe Chen; Bo Jiang; Jin Tang; Bin Luo; Dacheng Tao



To track a target in a video, current visual trackers usually adopt greedy search for target localization in each frame; that is, the candidate region with the maximum response score is selected as the tracking result for that frame. However, we found that this may not be an optimal choice, especially in challenging tracking scenarios such as heavy occlusion and fast motion. To address this issue, we propose to maintain multiple tracking trajectories and apply a beam search strategy to visual tracking, so that the trajectory with fewer accumulated errors can be identified. Accordingly, this paper introduces a novel multi-agent reinforcement learning based beam search tracking strategy, termed BeamTracking. It is mainly inspired by the image captioning task, which takes an image as input and generates diverse descriptions using the beam search algorithm. We formulate tracking as a sample selection problem fulfilled by multiple parallel decision-making processes, each of which aims to pick out one sample as its tracking result in each frame. Each maintained trajectory is associated with an agent that performs the decision-making and determines what actions should be taken to update related information. When all the frames have been processed, we select the trajectory with the maximum accumulated score as the tracking result. Extensive experiments on seven popular tracking benchmark datasets validate the effectiveness of the proposed algorithm.
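The difference between greedy search and beam search over trajectories can be illustrated with a small toy sketch. This is not the BeamTracking method itself: it has no learned agents and no reinforcement learning, and everything in it is an assumption made for illustration, including the 1-D positions, the hand-picked response scores, the `jump_penalty` motion term, and the function names `step_score` and `beam_track`. It only shows how keeping several candidate trajectories and choosing the one with the highest accumulated score can avoid an early mistake that a greedy tracker commits to.

```python
from typing import List, Tuple

# Hypothetical toy candidate: (1-D position, response score).
Candidate = Tuple[float, float]


def step_score(prev_pos: float, cand: Candidate, jump_penalty: float) -> float:
    """Response score minus an assumed penalty for moving far from the last position."""
    pos, response = cand
    return response - jump_penalty * abs(pos - prev_pos)


def beam_track(
    frames: List[List[Candidate]],
    start_pos: float,
    beam_width: int,
    jump_penalty: float = 0.1,
) -> Tuple[float, List[float]]:
    """Keep the `beam_width` best trajectories per frame; return the best at the end.

    With beam_width=1 this reduces to greedy per-frame selection.
    """
    beams = [(0.0, [start_pos])]  # (accumulated score, positions so far)
    for candidates in frames:
        expanded = []
        for total, path in beams:
            for cand in candidates:
                new_total = total + step_score(path[-1], cand, jump_penalty)
                expanded.append((new_total, path + [cand[0]]))
        # Keep only the top-scoring trajectories.
        expanded.sort(key=lambda item: item[0], reverse=True)
        beams = expanded[:beam_width]
    return beams[0]


# Toy sequence: in frame 1 a distractor at +1 scores slightly higher than the
# true target at -1; in frame 2 the target at -2 is clearly visible again.
frames = [
    [(-1.0, 0.60), (1.0, 0.65)],
    [(-2.0, 0.90), (2.0, 0.20)],
]

greedy_total, greedy_path = beam_track(frames, start_pos=0.0, beam_width=1)
beam_total, beam_path = beam_track(frames, start_pos=0.0, beam_width=2)

print("greedy:", greedy_path)  # visits the distractor at +1 in frame 1
print("beam:  ", beam_path)    # stays on the target at -1 in frame 1
```

In this toy sequence the greedy run locks onto the distractor in the first frame and pays for the jump back later, while the run with two beams retains the lower-scoring but consistent trajectory and ends with the higher accumulated score. In the paper, the per-trajectory decisions are made by agents rather than by a fixed scoring rule like the one assumed here.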

updated: Tue Aug 30 2022 11:20:38 GMT+0000 (UTC)

published: Thu May 19 2022 16:35:36 GMT+0000 (UTC)

arXiv
