G^2DA: Geometry-Guided Dual-Alignment Learning for RGB-Infrared Person Re-Identification

Lin Wan; Zongyuan Sun; Qianyan Jing; Yehansen Chen; Lijing Lu; Zhihang Li

RGB-Infrared (IR) person re-identification aims to retrieve a person of interest across heterogeneous modalities, and it suffers from the large modality discrepancy caused by different sensing devices. Existing methods mainly focus on global-level modality alignment while largely neglecting sample-level modality divergence, which leads to performance degradation. This paper approaches RGB-IR ReID by tackling sample-level modality differences and presents a Geometry-Guided Dual-Alignment learning framework (G^2DA), which jointly enhances modality invariance and reinforces discriminability by incorporating human topological structure into the features, boosting overall matching performance. Specifically, G^2DA extracts accurate body-part features with a pose estimator; these serve as a semantic bridge that supplies the local details missing from the global descriptor. Based on the extracted local and global features, a novel distribution constraint derived from optimal transport is introduced to mitigate the modality gap in a fine-grained, sample-level manner. Beyond pairwise relations across the two modalities, the constraint additionally measures the structural similarity of different parts, so that both the multi-level features and their relations are kept consistent in the common feature space. To exploit the inherent human-topology information, we further propose a geometry-guided graph learning module that refines each part feature, emphasizing relevant regions while suppressing meaningless ones, which facilitates robust feature learning. Extensive experiments on two standard benchmark datasets validate the superiority of the proposed method, which achieves competitive performance against state-of-the-art approaches.
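The abstract does not spell out the optimal-transport constraint, so the following is only a minimal sketch of the general idea of sample-level alignment via entropic optimal transport (Sinkhorn iterations), not the authors' formulation. The function names `cosine_cost` and `sinkhorn_plan`, the cosine cost, the uniform sample weights, the regularization strength `eps`, and the random stand-in features are all assumptions made for illustration.

```python
import numpy as np


def cosine_cost(x, y):
    """Pairwise cosine distance between rows of x and rows of y (assumed cost)."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    return 1.0 - x @ y.T


def sinkhorn_plan(cost, eps=0.1, n_iters=200):
    """Entropy-regularized transport plan between two uniform distributions."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)  # uniform weight on each RGB sample (assumption)
    b = np.full(m, 1.0 / m)  # uniform weight on each IR sample (assumption)
    kernel = np.exp(-cost / eps)
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)    # rescale rows toward marginal a
        v = b / (kernel.T @ u)  # rescale columns to match marginal b
    return u[:, None] * kernel * v[None, :]


# Random stand-ins for RGB and IR feature vectors (hypothetical data).
rng = np.random.default_rng(0)
rgb_feats = rng.normal(size=(4, 8))
ir_feats = rng.normal(size=(6, 8))

cost = cosine_cost(rgb_feats, ir_feats)
plan = sinkhorn_plan(cost)
# Transport cost under the plan; a quantity of this kind could act as an alignment loss.
ot_loss = float((plan * cost).sum())
```

Each entry `plan[i, j]` indicates how much of RGB sample `i` is matched to IR sample `j`, which is what makes the alignment sample-level and not a single global statistic. In a training setup the same computation would typically be written in an autodiff framework so that `ot_loss` can be backpropagated; that step is omitted here.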

updated: Tue Jun 15 2021 03:14:31 GMT+0000 (UTC)

published: Tue Jun 15 2021 03:14:31 GMT+0000 (UTC)

arXiv
