A Semantic-aware Attention and Visual Shielding Network for Cloth-changing Person Re-identification

Zan Gao; Hongwei Wei; Weili Guan; Jie Nie; Meng Wang; Shenyong Chen

Cloth-changing person re-identification (ReID) is a newly emerging research topic that aims to retrieve pedestrians whose clothes have changed. Because a person's appearance varies greatly across different outfits, it is very difficult for existing approaches to extract discriminative and robust feature representations. Current works mainly focus on body shape or contour sketches, while human semantic information and the potential consistency of pedestrian features before and after a change of clothes are either not fully explored or ignored. To address these issues, this work proposes a novel semantic-aware attention and visual shielding network for cloth-changing person ReID (abbreviated as SAVS), where the key idea is to shield clues related to the appearance of clothes and focus only on visual semantic information that is not sensitive to view/posture changes. Specifically, a visual semantic encoder is first employed to locate the human body and clothing regions based on human semantic segmentation information. Then, a human semantic attention module (HSA) is proposed to highlight the human semantic information and reweight the visual feature map. In addition, a visual clothes shielding module (VCS) is designed to extract a more robust feature representation for the cloth-changing task by covering the clothing regions and focusing the model on visual semantic information unrelated to the clothes. Most importantly, these two modules are jointly explored in an end-to-end unified framework. Extensive experiments demonstrate that the proposed method significantly outperforms state-of-the-art methods and extracts more robust features for cloth-changing persons. Compared with FSAM (published in CVPR 2021), this method achieves improvements of 32.7% (16.5%) and 14.9% (-) on the LTCC and PRCC datasets in terms of mAP (rank-1), respectively.
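
The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: covering clothing regions using a human parsing map (the role of VCS) and reweighting a feature map toward body regions (the role of HSA). Everything here is an assumption made for illustration: the label ids, the function names `shield_clothes` and `semantic_attention`, the constant fill value, and the fixed `boost` weight, which stands in for what would presumably be a learned attention in the actual network.

```python
# Hypothetical label ids for a human parsing map (assumption; real parsers
# define their own label sets).
BACKGROUND, HEAD, ARMS, LEGS, UPPER_CLOTHES, PANTS = range(6)
CLOTHING_LABELS = {UPPER_CLOTHES, PANTS}


def shield_clothes(image, parsing, fill=0.0):
    """Replace every pixel labelled as clothing with a constant fill value.

    `image` and `parsing` are 2D lists of the same shape.
    """
    return [
        [fill if label in CLOTHING_LABELS else pixel
         for pixel, label in zip(img_row, lab_row)]
        for img_row, lab_row in zip(image, parsing)
    ]


def semantic_attention(features, parsing, boost=1.0):
    """Scale body locations by (1 + boost); leave all other locations as-is.

    A fixed mask-based weight is used here in place of a learned attention.
    """
    def weight(label):
        is_body = label != BACKGROUND and label not in CLOTHING_LABELS
        return 1.0 + boost if is_body else 1.0

    return [
        [value * weight(label) for value, label in zip(feat_row, lab_row)]
        for feat_row, lab_row in zip(features, parsing)
    ]


# Toy 2x2 example: one head pixel, two clothing pixels, one background pixel.
image = [[0.5, 0.8],
         [0.3, 0.9]]
parsing = [[HEAD, UPPER_CLOTHES],
           [BACKGROUND, PANTS]]

shielded = shield_clothes(image, parsing)
reweighted = semantic_attention(image, parsing)
```

In this toy setting, the shielded image keeps the head and background pixels and zeroes the two clothing pixels, while the reweighted map doubles only the head location. How SAVS actually combines the two branches in its end-to-end framework is not specified in the abstract and is not modelled here.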

updated: Mon Jul 18 2022 05:38:37 GMT+0000 (UTC)

published: Mon Jul 18 2022 05:38:37 GMT+0000 (UTC)

arXiv
