AMPose: Alternatively Mixed Global-Local Attention Model for 3D Human Pose Estimation

Hongxin Lin; Yunwei Chiu; Peiyuan Wu

The graph convolutional network (GCN) has been applied to 3D human pose estimation (HPE). In addition, pure transformer models have recently shown promising results in video-based methods. However, single-frame methods still need to model the physically connected relations among joints, because a feature representation transformed only by global attention lacks the relationships of the human skeleton. To address this problem, we propose a novel architecture, AMPose, that combines the physically connected and global relations among joints in the human skeleton for human pose estimation. The effectiveness of the proposed method is demonstrated through evaluation on the Human3.6M dataset. Our model also shows better generalization ability in a cross-dataset comparison on MPI-INF-3DHP.
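The idea of alternating global attention with skeleton-aware local mixing can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the block design, so the 5-joint skeleton, the identity query/key/value projections, the row-normalized adjacency, and all function names below are assumptions chosen only to keep the example small and dependency-free.

```python
import math

# Hypothetical 5-joint toy skeleton (NOT the Human3.6M joint layout):
# 0 = pelvis, 1 = spine, 2 = head, 3 = left hip, 4 = right hip.
BONES = [(0, 1), (1, 2), (0, 3), (0, 4)]
NUM_JOINTS = 5


def normalized_adjacency(num_joints, bones):
    """Row-normalized adjacency with self-loops (an assumed normalization)."""
    adj = [[1.0 if i == j else 0.0 for j in range(num_joints)]
           for i in range(num_joints)]
    for a, b in bones:
        adj[a][b] = adj[b][a] = 1.0
    return [[v / sum(row) for v in row] for row in adj]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def add(a, b):
    """Element-wise sum, used for the residual connections."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def global_attention(feats):
    """Global step: every joint attends to every other joint.

    Single head with identity Q/K/V projections (a simplification).
    """
    scale = math.sqrt(len(feats[0]))
    transposed = [list(col) for col in zip(*feats)]
    scores = matmul(feats, transposed)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return add(feats, matmul(weights, feats))


def local_graph_mixing(feats, adj):
    """Local step: each joint averages itself and its bone-connected neighbors."""
    return add(feats, matmul(adj, feats))


def alternate_blocks(feats, adj, depth=2):
    """Alternate the global and local steps `depth` times."""
    for _ in range(depth):
        feats = global_attention(feats)
        feats = local_graph_mixing(feats, adj)
    return feats


if __name__ == "__main__":
    adj = normalized_adjacency(NUM_JOINTS, BONES)
    # Toy per-joint features, e.g. embedded 2D keypoints (made-up values).
    feats = [[float(i), 1.0] for i in range(NUM_JOINTS)]
    out = alternate_blocks(feats, adj)
    print(len(out), len(out[0]))
```

In this sketch the attention step can relate any pair of joints, while the adjacency step only passes information along bones, which is the kind of physically connected relation the abstract says global attention alone does not capture.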

updated: Wed Oct 26 2022 14:48:17 GMT+0000 (UTC)

published: Sun Oct 09 2022 10:10:13 GMT+0000 (UTC)

arXiv
