TranSG: Transformer-Based Skeleton Graph Prototype Contrastive Learning with Structure-Trajectory Prompted Reconstruction for Person Re-Identification

Haocong Rao; Chunyan Miao

Person re-identification (re-ID) via 3D skeleton data is an emerging topic with prominent advantages. Existing methods usually design skeleton descriptors with raw body joints or perform skeleton sequence representation learning. However, they typically cannot concurrently model different body-component relations, and rarely explore useful semantics from fine-grained representations of body joints. In this paper, we propose a generic Transformer-based Skeleton Graph prototype contrastive learning (TranSG) approach with structure-trajectory prompted reconstruction to fully capture skeletal relations and valuable spatial-temporal semantics from skeleton graphs for person re-ID. Specifically, we first devise the Skeleton Graph Transformer (SGT) to simultaneously learn body and motion relations within skeleton graphs, so as to aggregate key correlative node features into graph representations. Then, we propose the Graph Prototype Contrastive learning (GPC) to mine the most typical graph features (graph prototypes) of each identity, and contrast the inherent similarity between graph representations and different prototypes from both skeleton and sequence levels to learn discriminative graph representations. Last, a graph Structure-Trajectory Prompted Reconstruction (STPR) mechanism is proposed to exploit the spatial and temporal contexts of graph nodes to prompt skeleton graph reconstruction, which facilitates capturing more valuable patterns and graph semantics for person re-ID. Empirical evaluations demonstrate that TranSG significantly outperforms existing state-of-the-art methods. We further show its generality under different graph modeling, RGB-estimated skeletons, and unsupervised scenarios.
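The prototype-contrast idea behind GPC can be illustrated with a minimal sketch. The abstract does not give the actual loss, so everything below is an assumption made for illustration: each identity's prototype is taken to be the normalized mean of its graph features, similarity is cosine similarity scaled by a hypothetical `temperature` parameter, and the loss is a softmax cross-entropy over prototypes. The function names are likewise hypothetical, and only a single level is shown, whereas the paper contrasts at both the skeleton and sequence levels.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def class_prototypes(features, labels):
    """Assumed prototype definition: normalized mean feature per identity."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        f = l2_normalize(f)
        if y not in sums:
            sums[y] = [0.0] * len(f)
            counts[y] = 0
        sums[y] = [s + x for s, x in zip(sums[y], f)]
        counts[y] += 1
    return {y: l2_normalize([s / counts[y] for s in sums[y]]) for y in sums}


def prototype_contrastive_loss(features, labels, temperature=0.1):
    """Mean cross-entropy of each feature against all identity prototypes.

    Each feature is pulled toward its own identity's prototype and pushed
    away from the others (a generic formulation, not the paper's exact loss).
    """
    protos = class_prototypes(features, labels)
    ids = sorted(protos)
    total = 0.0
    for f, y in zip(features, labels):
        f = l2_normalize(f)
        # Temperature-scaled cosine similarity to every prototype.
        logits = [sum(a * b for a, b in zip(f, protos[k])) / temperature
                  for k in ids]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denom - logits[ids.index(y)]
    return total / len(features)
```

Under these assumptions, features that cluster by identity give a lower loss than the same features with shuffled identity labels, which is the behavior a discriminative representation should show.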

updated: Mon Mar 13 2023 02:27:45 GMT+0000 (UTC)

published: Mon Mar 13 2023 02:27:45 GMT+0000 (UTC)

arXiv
