CM-NAS: Rethinking Cross-Modality Neural Architectures for Visible-Infrared Person Re-Identification

Chaoyou Fu; Yibo Hu; Xiang Wu; Hailin Shi; Tao Mei; Ran He

Visible-Infrared person re-identification (VI-ReID) aims to match cross-modality pedestrian images, overcoming the limitation of single-modality person ReID in dark environments. To mitigate the impact of the large modality discrepancy, existing works manually design various two-stream architectures that separately learn modality-specific and modality-sharable representations. Such a manual design routine, however, depends heavily on massive experiments and empirical practice, which is time-consuming and labor-intensive. In this paper, we systematically study the manually designed architectures and identify that appropriately splitting Batch Normalization (BN) layers to learn modality-specific representations brings a great boost to cross-modality matching. Based on this observation, the essential objective is to find the optimal splitting scheme for each BN layer. To this end, we propose a novel method named Cross-Modality Neural Architecture Search (CM-NAS). It consists of a BN-oriented search space in which the standard optimization can be fulfilled subject to the cross-modality task. In addition, to better guide the search process, we formulate a new Correlation Consistency based Class-specific Maximum Mean Discrepancy (C3MMD) loss. Beyond the modality discrepancy, this loss also accounts for the similarity correlations in the two modalities, which have previously been overlooked. With these advantages, our method outperforms state-of-the-art counterparts in extensive experiments, improving the Rank-1/mAP by 6.70%/6.13% on SYSU-MM01 and 12.17%/11.23% on RegDB. The source code will be released soon.
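To make the idea of splitting a BN layer by modality concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`normalize`, `bn_layer`), the per-channel `split_channels` flags, the toy feature values, and the omission of learnable affine parameters and running statistics are all assumptions made for illustration. The abstract only states that a splitting scheme is searched for each BN layer.

```python
import math


def normalize(values, eps=1e-5):
    """Normalize scalars to zero mean and unit variance (no affine parameters)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


def bn_layer(batch, modalities, split_channels):
    """Hypothetical BN layer with a per-channel split/share decision.

    batch:          list of samples, each a list of channel values
    modalities:     modality label for each sample (e.g. "vis" or "ir")
    split_channels: one bool per channel; True means the channel uses
                    separate statistics for each modality, False means
                    one set of statistics over the whole mixed batch
    """
    n_channels = len(batch[0])
    out = [[0.0] * n_channels for _ in batch]
    for c in range(n_channels):
        if split_channels[c]:
            # Modality-specific statistics: one index group per modality.
            groups = [
                [i for i, m in enumerate(modalities) if m == name]
                for name in sorted(set(modalities))
            ]
        else:
            # Modality-shared statistics: a single group with every sample.
            groups = [list(range(len(batch)))]
        for idx in groups:
            normed = normalize([batch[i][c] for i in idx])
            for i, v in zip(idx, normed):
                out[i][c] = v
    return out


# Toy batch (assumed values): channel 0 has a large offset between modalities.
visible = [[1.0, 0.5], [2.0, 1.5], [3.0, 2.5]]
infrared = [[11.0, 0.6], [12.0, 1.4], [13.0, 2.6]]
batch = visible + infrared
modalities = ["vis"] * 3 + ["ir"] * 3

shared = bn_layer(batch, modalities, [False, False])  # no channel is split
split = bn_layer(batch, modalities, [True, False])    # only channel 0 is split
```

With shared statistics, the offset in channel 0 survives normalization, so visible samples land below zero and infrared samples above it. With channel 0 split, each modality is centered on its own mean and the two groups become directly comparable. A search over such split/share flags is one way to read the "splitting scheme" mentioned in the abstract, under the assumptions stated above.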

updated: Thu Jan 21 2021 07:07:00 GMT+0000 (UTC)

published: Thu Jan 21 2021 07:07:00 GMT+0000 (UTC)

arXiv
