A Global to Local Double Embedding Method for Multi-person Pose Estimation

Yiming Xu; Jiaxin Li; Yan Ding; Hua-Liang Wei

Multi-person pose estimation is a fundamental and challenging problem in many computer vision tasks. Most existing methods fall into two broad classes: top-down and bottom-up. Both involve two stages, namely person detection and joint detection. Conventionally, the two stages are implemented separately, without considering the interactions between them, which can cause inherent problems. In this paper, we present a novel method that simplifies the pipeline by performing person detection and joint detection simultaneously. We propose a Double Embedding (DE) method that completes the multi-person pose estimation task in a global-to-local way. DE consists of Global Embedding (GE) and Local Embedding (LE). GE encodes different person instances and processes information covering the whole image, while LE encodes local limb information. GE serves person detection in a top-down strategy, while LE connects the remaining joints sequentially, serving joint grouping and information processing in a bottom-up strategy. Based on LE, we design the Mutual Refine Machine (MRM) to reduce prediction difficulty in complex scenarios. MRM enables effective information exchange between keypoints and further improves accuracy. We achieve competitive results on the MSCOCO, MPII and CrowdPose benchmarks, demonstrating the effectiveness and generalization ability of our method.
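The global-to-local grouping described above can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the embeddings are represented or compared, so the scalar `ge` and `le` values, the `Candidate` type, the `CHAIN` ordering, the `group_poses` function and the `ge_thresh` parameter are all hypothetical choices made for illustration. The sketch assumes that joints of the same person carry similar global embedding values and that the two ends of a limb carry similar local embedding values.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    joint: str    # joint type, e.g. "head"
    xy: tuple     # image position of the detected keypoint
    ge: float     # hypothetical global embedding value (similar within one person)
    le: float     # hypothetical local embedding value (similar along one limb)


# Hypothetical limb order: each pair is (already attached joint, joint to attach next).
CHAIN = [("head", "neck"), ("neck", "hip")]


def group_poses(candidates, root="head", ge_thresh=0.5):
    """Toy global-to-local grouping (an assumption, not the paper's algorithm)."""
    # Global step: every root-joint candidate starts one person instance.
    people = [{root: c} for c in candidates if c.joint == root]

    # Local step: attach the remaining joints one limb at a time.
    for parent, child in CHAIN:
        pool = [c for c in candidates if c.joint == child]
        for person in people:
            if parent not in person:
                continue
            anchor = person[parent]
            # Keep only candidates whose global embedding matches this person.
            near = [c for c in pool if abs(c.ge - person[root].ge) < ge_thresh]
            if not near:
                continue
            # Pick the candidate whose local embedding is closest to the parent's.
            best = min(near, key=lambda c: abs(c.le - anchor.le))
            person[child] = best
            pool.remove(best)
    return people


# Two made-up people with well separated embedding values.
cands = [
    Candidate("head", (10, 10), 0.10, 1.0),
    Candidate("head", (50, 10), 2.00, 5.0),
    Candidate("neck", (50, 20), 2.10, 5.1),
    Candidate("neck", (10, 20), 0.10, 1.1),
    Candidate("hip", (10, 40), 0.15, 1.2),
    Candidate("hip", (50, 40), 2.00, 5.2),
]
people = group_poses(cands)
for person in people:
    print({name: c.xy for name, c in person.items()})
```

In this sketch the global values decide which person a joint may belong to, and the local values decide which candidate continues a given limb, mirroring the top-down and bottom-up roles that the abstract assigns to GE and LE.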

updated: Sat Aug 28 2021 05:46:14 GMT+0000 (UTC)

published: Mon Feb 15 2021 03:13:38 GMT+0000 (UTC)

arXiv
