HopFIR: Hop-wise GraphFormer with Intragroup Joint Refinement for 3D Human Pose Estimation

Kai Zhai; Qiang Nie; Bo Ouyang; Xiang Li; Shanlin Yang

2D-to-3D human pose lifting is fundamental to 3D human pose estimation (HPE), and graph convolutional networks (GCNs) have proven inherently suitable for modeling the human skeletal topology. However, current GCN-based 3D HPE methods update node features by aggregating information from neighboring nodes, without considering how joints interact within different joint synergies. Although some studies have proposed importing limb information to learn movement patterns, latent synergies among joints, such as those involved in maintaining balance, are seldom investigated. We propose the Hop-wise GraphFormer with Intragroup Joint Refinement (HopFIR) architecture to tackle the 3D HPE problem. HopFIR mainly consists of a novel hop-wise GraphFormer (HGF) module and an intragroup joint refinement (IJR) module. The HGF module groups the joints by k-hop neighbors and applies a hop-wise transformer-like attention mechanism to these groups to discover latent joint synergies. The IJR module leverages prior limb information to refine peripheral joints. Extensive experimental results show that HopFIR outperforms state-of-the-art (SOTA) methods by a large margin, achieving a mean per-joint position error (MPJPE) of 32.67 mm on the Human3.6M dataset. We also demonstrate that state-of-the-art GCN-based methods benefit significantly from the proposed hop-wise attention mechanism: the performance of SemGCN and MGCN improves by 8.9% and 4.5%, respectively.
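Two ideas in the abstract can be made concrete with a small sketch: grouping joints by hop distance on the skeleton graph, and the MPJPE metric. The code below is an illustration under stated assumptions, not the authors' implementation. The 5-joint skeleton, the function names `k_hop_groups` and `mpjpe`, and the choice to read "k-hop neighbors" as joints exactly k edges away are all hypothetical. The attention mechanism applied to the groups is not shown.

```python
from collections import deque
from math import dist


def k_hop_groups(num_joints, edges, max_hop):
    """For each joint, map hop distance -> sorted joints exactly that many edges away.

    Assumption: "k-hop neighbors" means joints at graph distance exactly k.
    """
    adjacency = [[] for _ in range(num_joints)]
    for a, b in edges:
        adjacency[a].append(b)
        adjacency[b].append(a)

    groups = []
    for source in range(num_joints):
        # Breadth-first search, truncated at max_hop.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if distance[node] == max_hop:
                continue
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)

        by_hop = {hop: [] for hop in range(1, max_hop + 1)}
        for joint, hop in distance.items():
            if hop > 0:
                by_hop[hop].append(joint)
        groups.append({hop: sorted(joints) for hop, joints in by_hop.items()})
    return groups


def mpjpe(predicted, target):
    """Mean Euclidean distance between matching 3D joints, in the input units."""
    if not predicted or len(predicted) != len(target):
        raise ValueError("predicted and target must be non-empty and equal in length")
    return sum(dist(p, t) for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical toy skeleton (not the Human3.6M layout):
# 0 = hip, 1 = knee, 2 = ankle, 3 = spine, 4 = head
TOY_EDGES = [(0, 1), (1, 2), (0, 3), (3, 4)]

if __name__ == "__main__":
    print(k_hop_groups(5, TOY_EDGES, max_hop=2)[0])  # {1: [1, 3], 2: [2, 4]}
    print(mpjpe([(0, 0, 0), (1, 0, 0)], [(0, 0, 3), (1, 4, 0)]))  # 3.5
```

With coordinates expressed in millimetres, `mpjpe` returns a value in the same units as the 32.67 mm figure reported above.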

updated: Sat Aug 19 2023 14:02:46 GMT+0000 (UTC)

published: Tue Feb 28 2023 14:03:40 GMT+0000 (UTC)

arXiv
