Hybrid-Attention Guided Network with Multiple Resolution Features for Person Re-Identification

Guoqing Zhang; Junchuan Yang; Yuhui Zheng; Yi Wu; Shengyong Chen

Extracting effective and discriminative features is essential for the challenging person re-identification (re-ID) task. Prevailing deep convolutional neural networks (CNNs) usually rely on high-level features to identify pedestrians. However, some essential spatial information residing in low-level features, such as shape, texture, and color, is lost when learning high-level features because of the extensive padding and pooling operations in the training stage. In addition, most existing person re-ID methods are based on hand-crafted bounding boxes in which images are precisely aligned. This is unrealistic in practical applications, since the object detection algorithms used there often produce inaccurate bounding boxes, which inevitably degrades the performance of existing algorithms. To address these problems, we propose a novel person re-ID model that fuses high- and low-level embeddings to reduce the information loss incurred in learning high-level features. We then divide the fused embedding into several parts and reconnect them to obtain the global feature and more significant local features, so as to alleviate the effect of inaccurate bounding boxes. We also introduce spatial and channel attention mechanisms into our model, which aim to mine more discriminative features related to the target. Finally, we reconstruct the feature extractor to ensure that our model can obtain richer and more robust features. Extensive experiments demonstrate the superiority of our approach over existing approaches. Our code is available at https://github.com/libraflower/MutipleFeature-for-PRID.
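
The data flow described in the abstract (fuse low- and high-level maps, apply channel and spatial attention, then pool a global feature and several part features) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: all function names are hypothetical, feature maps are plain nested lists of shape C x H x W, and the attention gates are parameter-free stand-ins, whereas attention modules in a real network would normally be learned.

```python
import math


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def avg_pool2x(fmap):
    """2x2 average pooling of a C x H x W map (assumes even H and W)."""
    return [[[(ch[2 * i][2 * j] + ch[2 * i][2 * j + 1]
               + ch[2 * i + 1][2 * j] + ch[2 * i + 1][2 * j + 1]) / 4.0
              for j in range(len(ch[0]) // 2)]
             for i in range(len(ch) // 2)]
            for ch in fmap]


def fuse(low, high):
    """Assumed fusion: pool the low-level map to the high-level
    resolution, then concatenate along the channel axis."""
    return avg_pool2x(low) + high


def channel_attention(fmap):
    """Stand-in gate: scale each channel by sigmoid(mean activation)."""
    out = []
    for ch in fmap:
        mean = sum(map(sum, ch)) / (len(ch) * len(ch[0]))
        gate = _sigmoid(mean)
        out.append([[v * gate for v in row] for row in ch])
    return out


def spatial_attention(fmap):
    """Stand-in gate: scale each location by sigmoid(cross-channel mean)."""
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    gate = [[_sigmoid(sum(fmap[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]
    return [[[fmap[k][i][j] * gate[i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]


def part_features(fmap, parts):
    """Return (global_feature, local_features): average pooling over the
    whole map and over `parts` equal horizontal stripes (H % parts == 0)."""
    h, w = len(fmap[0]), len(fmap[0][0])
    stripe = h // parts

    def pool(lo, hi):
        return [sum(sum(ch[i]) for i in range(lo, hi)) / ((hi - lo) * w)
                for ch in fmap]

    return pool(0, h), [pool(p * stripe, (p + 1) * stripe)
                        for p in range(parts)]


# Toy input: one low-level channel (4 x 4) and one high-level channel (2 x 2).
low = [[[2.0] * 4 for _ in range(4)]]
high = [[[0.0] * 2 for _ in range(2)]]

fused = fuse(low, high)
attended = spatial_attention(channel_attention(fused))
global_feat, local_feats = part_features(attended, parts=2)
```

Horizontal stripes are used here because pedestrian crops are usually upright, so each stripe roughly follows a body region; how the actual model divides and reconnects the fused embedding is not specified in the abstract.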

updated: Sun Jun 06 2021 03:05:03 GMT+0000 (UTC)

published: Wed Sep 16 2020 08:12:42 GMT+0000 (UTC)

arXiv
