GraphMLP: A Graph MLP-Like Architecture for 3D Human Pose Estimation

Wenhao Li; Hong Liu; Tianyu Guo; Hao Tang; Runwei Ding

Modern multi-layer perceptron (MLP) models have shown competitive results in learning visual representations without self-attention. However, existing MLP models are not good at capturing local details and lack prior knowledge of human configurations, which limits their modeling power for skeletal representation learning. To address these issues, we propose a simple yet effective graph-reinforced MLP-like architecture, named GraphMLP, that combines MLPs and graph convolutional networks (GCNs) in a global-local-graphical unified architecture for 3D human pose estimation. GraphMLP incorporates the graph structure of human bodies into an MLP model to meet the domain-specific demand while also allowing for both local and global spatial interactions. Extensive experiments show that the proposed GraphMLP achieves state-of-the-art performance on two datasets, i.e., Human3.6M and MPI-INF-3DHP. Our source code and pretrained models will be publicly available.
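As a rough illustration of what combining a global MLP with a local graph convolution over the skeleton might look like, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify how the two branches are wired together, so the residual sum, the ReLU activation, the toy 5-joint chain skeleton, the random weights, and every name (`graph_mlp_block`, `normalized_adjacency`, `BONES`, and so on) are assumptions made purely for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(num_joints, bones):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected skeleton graph."""
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)] for i in range(num_joints)]
    for i, j in bones:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; stands in for learned parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def graph_mlp_block(x, a_hat, w_in, w_out, w_graph):
    """x holds J x C joint features; the result has the same shape."""
    # Global branch (assumed form): an MLP that mixes information across
    # all joints, J -> H -> J, with a ReLU in between.
    hidden = [[max(0.0, v) for v in row] for row in matmul(w_in, x)]  # H x C
    global_out = matmul(w_out, hidden)                                # J x C
    # Local branch (assumed form): one graph convolution, so each joint
    # aggregates features only from itself and its skeleton neighbours.
    local_out = matmul(matmul(a_hat, x), w_graph)                     # J x C
    # Assumed combination: residual sum of the input and both branches.
    return [[xv + gv + lv for xv, gv, lv in zip(x_row, g_row, l_row)]
            for x_row, g_row, l_row in zip(x, global_out, local_out)]


# Hypothetical toy skeleton: five joints connected in a chain.
BONES = [(0, 1), (1, 2), (2, 3), (3, 4)]
J, C, H = 5, 4, 8  # joints, channels per joint, hidden width of the joint MLP

rng = random.Random(0)
a_hat = normalized_adjacency(J, BONES)
x = random_matrix(J, C, rng, scale=1.0)
out = graph_mlp_block(
    x, a_hat,
    random_matrix(H, J, rng),  # joint-mixing MLP, first layer
    random_matrix(J, H, rng),  # joint-mixing MLP, second layer
    random_matrix(C, C, rng),  # graph-convolution channel weights
)
```

In this sketch the joint-mixing MLP lets every joint see every other joint, while the adjacency matrix restricts the graph branch to physically connected joints, which is one way to read the "local and global spatial interactions" mentioned above.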

updated: Mon Jun 13 2022 18:59:31 GMT+0000 (UTC)

published: Mon Jun 13 2022 18:59:31 GMT+0000 (UTC)

arXiv
