A Video Is Worth Three Views: Trigeminal Transformers for Video-based Person Re-identification

Xuehu Liu; Pingping Zhang; Chenyang Yu; Huchuan Lu; Xuesheng Qian; Xiaoyun Yang

Video-based person re-identification (Re-ID) aims to retrieve video sequences of the same person across non-overlapping cameras. Previous methods usually focus on a limited view, such as the spatial, temporal, or spatial-temporal view, and therefore lack observations across different feature domains. To capture richer perceptions and extract more comprehensive video representations, we propose a novel framework named Trigeminal Transformers (TMT) for video-based person Re-ID. More specifically, we design a trigeminal feature extractor to jointly transform raw video data into the spatial, temporal, and spatial-temporal domains. In addition, inspired by the great success of vision transformers, we introduce the transformer structure to video-based person Re-ID. In our work, three self-view transformers are proposed to exploit the relationships between local features for information enhancement in the spatial, temporal, and spatial-temporal domains. Moreover, a cross-view transformer is proposed to aggregate the multi-view features into comprehensive video representations. Experimental results indicate that our approach achieves better performance than other state-of-the-art approaches on public Re-ID benchmarks. We will release the code for model reproduction.
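
The three-view idea in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the pooling used to form each view, the identity projections inside `attention`, the mean pooling of tokens, and every function name and tensor shape below are assumptions chosen for illustration only. It shows backbone features being split into spatial, temporal, and spatial-temporal token sets, each set being enhanced by attention within its own view, and the views then attending to one another before being concatenated into a single video descriptor.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(queries, keys_values):
    """Scaled dot-product attention with identity projections (no learned weights)."""
    scale = np.sqrt(queries.shape[-1])
    weights = softmax(queries @ keys_values.T / scale, axis=-1)
    return queries + weights @ keys_values  # residual connection


def trigeminal_views(feats):
    """Split features of shape (T, H, W, C) into three token sets (assumed pooling)."""
    T, H, W, C = feats.shape
    spatial = feats.mean(axis=0).reshape(H * W, C)  # pool over time
    temporal = feats.mean(axis=(1, 2))              # pool over space, shape (T, C)
    spatio_temporal = feats.reshape(T * H * W, C)   # keep every location
    return spatial, temporal, spatio_temporal


def video_descriptor(feats):
    views = trigeminal_views(feats)
    # Self-view step: tokens attend to the other tokens of the same view.
    enhanced = [attention(v, v) for v in views]
    # Cross-view step: each view's tokens attend to the tokens of the other two views.
    fused = []
    for i, v in enumerate(enhanced):
        others = np.concatenate(
            [u for j, u in enumerate(enhanced) if j != i], axis=0
        )
        fused.append(attention(v, others).mean(axis=0))  # pool tokens to one vector
    return np.concatenate(fused)  # shape (3 * C,)


rng = np.random.default_rng(0)
# Hypothetical input: T=4 frames, an 8x4 feature map, C=16 channels.
feats = rng.standard_normal((4, 8, 4, 16))
desc = video_descriptor(feats)
print(desc.shape)  # (48,)
```

In a Re-ID setting, a descriptor of this kind would be compared between a query video and gallery videos, for example by cosine distance. A trained model would use learned projections, multiple heads, and learned branches in place of the fixed pooling shown here.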

updated: Mon Apr 05 2021 02:50:16 GMT+0000 (UTC)

published: Mon Apr 05 2021 02:50:16 GMT+0000 (UTC)

arXiv
