NFormer: Robust Person Re-identification with Neighbor Transformer

Haochen Wang; Jiayi Shen; Yongtuo Liu; Yan Gao; Efstratios Gavves

Person re-identification aims to retrieve persons in highly varying settings across different cameras and scenarios, in which robust and discriminative representation learning is crucial. Most research considers learning representations from single images, ignoring any potential interactions between them. However, due to the high intra-identity variations, ignoring such interactions typically leads to outlier features. To tackle this issue, we propose a Neighbor Transformer Network, or NFormer, which explicitly models interactions across all input images, thus suppressing outlier features and leading to more robust representations overall. As modelling interactions between an enormous number of images is a massive task with many distractors, NFormer introduces two novel modules, the Landmark Agent Attention and the Reciprocal Neighbor Softmax. Specifically, the Landmark Agent Attention efficiently models the relation map between images by a low-rank factorization with a few landmarks in feature space. Moreover, the Reciprocal Neighbor Softmax achieves sparse attention to only the relevant neighbors rather than all of them, which alleviates interference from irrelevant representations and further relieves the computational burden. In experiments on four large-scale datasets, NFormer achieves a new state of the art. The code is released at https://github.com/haochenheheda/NFormer.
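
The two modules named in the abstract can be illustrated with a small NumPy sketch. This is not the released NFormer code: the function names, the random choice of landmarks, the scaling factor, and the forced self-connection on the diagonal are all assumptions made for illustration. It only shows the general idea of (1) building an image-to-image affinity map through a few landmarks, so that its rank is bounded by the number of landmarks, and (2) applying a softmax restricted to pairs that appear in each other's top-k lists.

```python
import numpy as np

def landmark_affinity(q, k, num_landmarks, rng):
    """Affinity map between N images, routed through a few landmark features.

    Assumption: landmarks are sampled at random from the key features.
    """
    n = q.shape[0]
    idx = rng.choice(n, size=num_landmarks, replace=False)
    landmarks = k[idx]                     # (l, d)
    q_agent = q @ landmarks.T              # (n, l): each query described by l similarities
    k_agent = k @ landmarks.T              # (n, l)
    # The product of an (n, l) and an (l, n) matrix has rank at most l.
    return q_agent @ k_agent.T / np.sqrt(num_landmarks)

def reciprocal_neighbor_softmax(affinity, k):
    """Row-wise softmax restricted to mutual top-k neighbors."""
    n = affinity.shape[0]
    # Indices of the k largest affinities in every row.
    topk = np.argpartition(-affinity, k - 1, axis=1)[:, :k]
    is_neighbor = np.zeros((n, n), dtype=bool)
    is_neighbor[np.arange(n)[:, None], topk] = True
    # Keep pair (i, j) only if j is in i's top-k AND i is in j's top-k.
    mask = is_neighbor & is_neighbor.T
    # Assumption: always keep the self-connection so no row is empty.
    np.fill_diagonal(mask, True)
    logits = np.where(mask, affinity, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(logits)                               # masked entries become 0
    return weights / weights.sum(axis=1, keepdims=True)

# Toy usage: 8 hypothetical image features of dimension 16, unit-normalized.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)

aff = landmark_affinity(feats, feats, num_landmarks=4, rng=rng)
w = reciprocal_neighbor_softmax(aff, k=3)
refined = w @ feats    # each feature becomes a weighted mix of its mutual neighbors
```

In this toy setup each refined feature aggregates at most k + 1 inputs (its mutual neighbors plus itself), which is the sense in which the attention is sparse. How the actual model selects landmarks, projects queries and keys, and stacks these layers should be taken from the paper and the released repository rather than from this sketch.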

updated: Wed Apr 20 2022 09:06:47 GMT+0000 (UTC)

published: Wed Apr 20 2022 09:06:47 GMT+0000 (UTC)

arXiv
