HopFIR: Hop-wise GraphFormer with Intragroup Joint Refinement for 3D Human Pose Estimation

Kai Zhai; Qiang Nie; Bo Ouyang; Xiang Li; ShanLin Yang

Lifting 2D poses to 3D is fundamental to 3D human pose estimation (HPE). Graph Convolutional Networks (GCNs) have proven inherently suitable for modeling the human skeletal topology. However, current GCN-based 3D HPE methods update node features by aggregating information from neighboring nodes without considering how joints interact under different motion patterns. Although some studies incorporate limb information to learn movement patterns, latent synergies among joints, such as those that maintain balance during motion, are seldom investigated. We propose a hop-wise GraphFormer with intragroup joint refinement (HopFIR) to tackle the 3D HPE problem. HopFIR mainly consists of a novel Hop-wise GraphFormer (HGF) module and an Intragroup Joint Refinement (IJR) module, which leverages prior limb information to refine peripheral joints. The HGF module groups joints by their k-hop neighbors and applies a hop-wise, transformer-like attention mechanism among these groups to discover latent joint synergies. Extensive experimental results show that HopFIR outperforms state-of-the-art (SOTA) methods by a large margin: on the Human3.6M dataset, its mean per joint position error (MPJPE) is 32.67 mm. Furthermore, previous SOTA GCN-based methods can also benefit efficiently from the proposed hop-wise attention mechanism, with significant performance gains: SemGCN and MGCN improve by 8.9% and 4.5%, respectively.
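Two ideas in the abstract can be illustrated concretely: grouping joints by k-hop neighborhood, and the MPJPE metric. The following is a minimal sketch, not the authors' implementation. It assumes that "k-hop neighbors" means joints at shortest-path distance k on the skeleton graph, and it uses a hypothetical 5-joint chain rather than the Human3.6M skeleton. The names `hop_groups`, `mpjpe`, and `EDGES` are illustrative only; the attention step applied among the groups is not shown because the abstract does not specify it.

```python
import math
from collections import deque

# Hypothetical toy skeleton: a 5-joint chain (NOT the Human3.6M skeleton).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]


def hop_groups(num_joints, edges, source, max_hop):
    """Group joints by shortest-path hop distance from `source` using BFS.

    Returns a list where entry k holds the joints exactly k hops away
    (entry 0 is the source joint itself). Assumption: this is one plausible
    reading of "k-hop neighbors"; the paper may define groups differently.
    """
    adjacency = [[] for _ in range(num_joints)]
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)

    distance = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in distance:
                distance[v] = distance[u] + 1
                queue.append(v)

    groups = [[] for _ in range(max_hop + 1)]
    for joint, d in sorted(distance.items()):
        if d <= max_hop:
            groups[d].append(joint)
    return groups


def mpjpe(pred, target):
    """Mean per joint position error: average Euclidean distance per joint.

    `pred` and `target` are equal-length sequences of 3D points, in the
    same unit (millimetres in the Human3.6M results quoted above).
    """
    errors = [math.dist(p, t) for p, t in zip(pred, target)]
    return sum(errors) / len(errors)
```

For the toy chain, `hop_groups(5, EDGES, source=2, max_hop=2)` places joint 2 in the 0-hop group, joints 1 and 3 in the 1-hop group, and joints 0 and 4 in the 2-hop group. In a hop-wise attention design, each such group would presumably act as a unit that the attention mechanism weighs against the others.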

updated: Tue Jul 18 2023 16:07:55 GMT+0000 (UTC)

published: Tue Feb 28 2023 14:03:40 GMT+0000 (UTC)

arXiv
