HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation

Jiefeng Li; Chao Xu; Zhicun Chen; Siyuan Bian; Lixin Yang; Cewu Lu

Model-based 3D pose and shape estimation methods reconstruct a full 3D mesh of the human body by estimating a small set of parameters. However, learning these abstract parameters is a highly non-linear process and suffers from image-model misalignment, leading to mediocre performance. In contrast, 3D keypoint estimation methods combine deep CNNs with a volumetric representation to achieve pixel-level localization accuracy, but they may predict unrealistic body structures. In this paper, we address both issues by bridging the gap between body mesh estimation and 3D keypoint estimation. We propose a novel hybrid inverse kinematics solution (HybrIK). HybrIK directly transforms accurate 3D joints into relative body-part rotations for 3D body mesh reconstruction via the twist-and-swing decomposition. The swing rotation is solved analytically from the 3D joints, and the twist rotation is derived from visual cues by a neural network. We show that HybrIK preserves both the accuracy of the 3D pose and the realistic body structure of the parametric human model, leading to a pixel-aligned 3D body mesh and a more accurate 3D pose than pure 3D keypoint estimation methods. Without bells and whistles, the proposed method surpasses state-of-the-art methods by a large margin on various 3D human pose and shape benchmarks. As an illustrative example, HybrIK outperforms all previous methods by 13.2 mm MPJPE and 21.9 mm PVE on the 3DPW dataset. Our code is available at https://github.com/Jeff-sjtu/HybrIK.
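
To make the twist-and-swing idea concrete, the following is a minimal sketch for a single bone, assuming the standard form of the decomposition: the swing is the smallest rotation that takes the rest-pose bone direction onto the observed bone direction, and the twist is a rotation about the rest-pose bone axis. All function names here (swing_rotation, twist_rotation, hybrid_rotation) are hypothetical and are not taken from the HybrIK repository. The twist angle, which the abstract says comes from a neural network, is passed in as a plain number.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def rodrigues(axis, angle):
    """Rotation matrix for a rotation of `angle` radians about a unit `axis`."""
    x, y, z = axis
    c, s = math.cos(angle), math.sin(angle)
    k = 1.0 - c
    return [[c + x * x * k, x * y * k - z * s, x * z * k + y * s],
            [y * x * k + z * s, c + y * y * k, y * z * k - x * s],
            [z * x * k - y * s, z * y * k + x * s, c + z * z * k]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(rot, v):
    return [dot(row, v) for row in rot]

def swing_rotation(rest_bone, target_bone):
    """Analytical part: smallest rotation taking rest_bone onto target_bone."""
    t, p = normalize(rest_bone), normalize(target_bone)
    axis = cross(t, p)
    sin_a = math.sqrt(dot(axis, axis))
    cos_a = dot(t, p)
    if sin_a < 1e-8:
        if cos_a > 0.0:  # already aligned: no swing needed
            return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        # Opposite directions: rotate by pi about any axis perpendicular to t.
        helper = [1.0, 0.0, 0.0] if abs(t[0]) < 0.9 else [0.0, 1.0, 0.0]
        return rodrigues(normalize(cross(t, helper)), math.pi)
    return rodrigues([a / sin_a for a in axis], math.atan2(sin_a, cos_a))

def twist_rotation(rest_bone, twist_angle):
    """Twist part: rotation about the rest-pose bone axis (angle assumed given)."""
    return rodrigues(normalize(rest_bone), twist_angle)

def hybrid_rotation(rest_bone, target_bone, twist_angle):
    """Compose swing after twist: R = R_swing @ R_twist."""
    return matmul(swing_rotation(rest_bone, target_bone),
                  twist_rotation(rest_bone, twist_angle))

if __name__ == "__main__":
    rot = hybrid_rotation([0.0, 1.0, 0.0], [1.0, 0.0, 0.0], twist_angle=0.7)
    print(apply(rot, [0.0, 1.0, 0.0]))  # direction of the rotated rest-pose bone
```

Because the twist rotates about the bone axis itself, it leaves the bone direction unchanged, so the joint positions alone fix the swing while the twist has to come from another source. That is the division of labor the abstract describes between the analytical solver and the network.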

updated: Mon Apr 05 2021 13:57:49 GMT+0000 (UTC)

published: Mon Nov 30 2020 10:32:30 GMT+0000 (UTC)

arXiv
