Hierarchical Skeleton Meta-Prototype Contrastive Learning with Hard Skeleton Mining for Unsupervised Person Re-Identification

Haocong Rao; Cyril Leung; Chunyan Miao

With rapid advancements in depth sensors and deep learning, skeleton-based person re-identification (re-ID) models have recently achieved remarkable progress and offer many advantages. Most existing solutions learn single-level skeleton features from body joints under the assumption that all skeletons are equally important, and they typically cannot exploit more informative skeleton features from other levels, such as the limb level, which carries more global body patterns. The label dependency of these methods also limits their flexibility in learning more general skeleton representations. This paper proposes a generic unsupervised Hierarchical skeleton Meta-Prototype Contrastive learning (Hi-MPC) approach with Hard Skeleton Mining (HSM) for person re-ID with unlabeled 3D skeletons. First, we construct hierarchical representations of skeletons to model coarse-to-fine body and motion features at the levels of body joints, components, and limbs. We then propose a hierarchical meta-prototype contrastive learning model that clusters and contrasts the most typical skeleton features ("prototypes") from skeletons at different levels. By converting the original prototypes into meta-prototypes with multiple homogeneous transformations, we induce the model to learn the inherent consistency of prototypes and thereby capture more effective skeleton features for person re-ID. Furthermore, we devise a hard skeleton mining mechanism that adaptively infers the informative importance of each skeleton, so that the model focuses on harder skeletons and learns more discriminative skeleton representations. Extensive evaluations on five datasets demonstrate that our approach outperforms a wide variety of state-of-the-art skeleton-based methods. We further show the general applicability of our method to cross-view person re-ID and to RGB-based scenarios with estimated skeletons.
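The three ideas in the abstract (a joint/component/limb hierarchy, contrast against prototypes, and extra weight on harder skeletons) can be illustrated with a small toy sketch. This is not the authors' implementation. The 6-joint grouping, the mean pooling, the cosine-softmax loss, and the softmax weighting over losses are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
import math

# Hypothetical grouping for a toy 6-joint skeleton (not the paper's partition).
COMPONENTS = {"torso": [0, 1], "arm": [2, 3], "leg": [4, 5]}
LIMBS = {"upper": ["torso", "arm"], "lower": ["leg"]}


def mean_point(points):
    """Average a list of 3D points coordinate by coordinate."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def build_hierarchy(joints):
    """Pool joints into components, then components into limbs (assumed mean pooling)."""
    components = {name: mean_point([joints[i] for i in idx])
                  for name, idx in COMPONENTS.items()}
    limbs = {name: mean_point([components[c] for c in parts])
             for name, parts in LIMBS.items()}
    return {"joints": list(joints), "components": components, "limbs": limbs}


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def prototype_contrastive_loss(feature, prototypes, positive, temperature=0.1):
    """Cross-entropy over temperature-scaled similarities to all prototypes.

    The loss is low when `feature` is closest to prototypes[positive].
    """
    logits = [cosine(feature, p) / temperature for p in prototypes]
    top = max(logits)  # subtract the max for numerical stability
    log_sum = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_sum - logits[positive]


def hardness_weights(losses):
    """Assumed weighting: softmax over per-skeleton losses, so harder
    (higher-loss) skeletons receive larger weights."""
    top = max(losses)
    exps = [math.exp(l - top) for l in losses]
    total = sum(exps)
    return [e / total for e in exps]
```

In this sketch a feature that lies near its assigned prototype yields a loss close to zero, while a feature assigned to a distant prototype yields a large loss and therefore a larger weight from `hardness_weights`.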

updated: Sat Sep 16 2023 03:05:02 GMT+0000 (UTC)

published: Mon Jul 24 2023 16:18:22 GMT+0000 (UTC)

arXiv
