PoP-Net: Pose over Parts Network for Multi-Person 3D Pose Estimation from a Depth Image

Yuliang Guo; Zhong Li; Zekun Li; Xiangyu Du; Shuxue Quan; Yi Xu

This paper proposes PoP-Net, a real-time method for predicting multi-person 3D poses from a depth image. PoP-Net learns to predict bottom-up part representations and top-down global poses in a single shot. Specifically, a new part-level representation, called the Truncated Part Displacement Field (TPDF), is introduced, which enables an explicit fusion process that unifies the advantages of bottom-up part detection and global pose detection. In addition, an effective mode selection scheme automatically resolves conflicts between global pose and part detections. Finally, because high-quality depth datasets for developing multi-person 3D pose estimation are lacking, we introduce the Multi-Person 3D Human Pose Dataset (MP-3DHP) as a new benchmark. MP-3DHP is designed to enable effective multi-person and background data augmentation in model training, and to evaluate 3D human pose estimators under uncontrolled multi-person scenarios. We show that PoP-Net achieves state-of-the-art results on both MP-3DHP and the widely used ITOP dataset, and has significant efficiency advantages in multi-person processing. To demonstrate one application of our algorithm pipeline, we also show results of virtual avatars driven by the computed 3D joint positions. The MP-3DHP dataset and the evaluation code are available at: https://github.com/oppo-us-research/PoP-Net.
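The abstract names the Truncated Part Displacement Field but does not define it. As a rough illustration only, one plausible reading of the name is a per-pixel 2D offset that points to a body part's location and is set to zero beyond a fixed range. The sketch below builds such a field for a single part and uses it to refine a coarse part position taken from a global pose. The function names, the circular truncation, and the `radius` parameter are all assumptions made for this example and are not taken from the paper or its released code.

```python
import math

def truncated_displacement_field(height, width, part_xy, radius):
    """Hypothetical helper: per-pixel (dx, dy) offsets pointing at one part.

    Offsets are kept only for pixels within `radius` of the part
    (the assumed truncation); all other pixels hold (0.0, 0.0).
    """
    px, py = part_xy
    field = [[(0.0, 0.0) for _ in range(width)] for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = px - x, py - y
            if math.hypot(dx, dy) <= radius:
                field[y][x] = (float(dx), float(dy))
    return field

def refine_with_field(field, coarse_xy):
    """Hypothetical fusion step: shift a coarse estimate by the local offset."""
    x, y = coarse_xy
    dx, dy = field[y][x]
    return (x + dx, y + dy)

# Toy 8x8 image with one part at pixel (x=5, y=3) and a truncation range of 2.5 px.
field = truncated_displacement_field(8, 8, part_xy=(5, 3), radius=2.5)
near = refine_with_field(field, (4, 2))  # inside the range: snaps to the part
far = refine_with_field(field, (0, 0))   # outside the range: left unchanged
```

In this toy setup a coarse estimate that lands near the part is corrected to the part location, while one that lands far away receives no correction. A case like the latter is where a method would need some rule for choosing between the global pose and the part detection; the abstract refers to such a rule as mode selection but gives no details, so it is not modelled here.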

updated: Thu Nov 25 2021 01:10:34 GMT+0000 (UTC)

published: Sat Dec 12 2020 05:32:25 GMT+0000 (UTC)

arXiv
