OmniPose: A Multi-Scale Framework for Multi-Person Pose Estimation

Bruno Artacho; Andreas Savakis

We propose OmniPose, a single-pass, end-to-end trainable framework that achieves state-of-the-art results for multi-person pose estimation. Using a novel waterfall module, the OmniPose architecture leverages multi-scale feature representations that increase the effectiveness of backbone feature extractors, without the need for post-processing. OmniPose incorporates contextual information across scales and joint localization with Gaussian heatmap modulation at the multi-scale feature extractor to estimate human pose with state-of-the-art accuracy. The multi-scale representations obtained by the improved waterfall module in OmniPose leverage the efficiency of progressive filtering in the cascade architecture while maintaining multi-scale fields-of-view comparable to spatial pyramid configurations. Our results on multiple datasets demonstrate that OmniPose, with an improved HRNet backbone and waterfall module, is a robust and efficient architecture for multi-person pose estimation.
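
The abstract refers to joint localization with Gaussian heatmaps. As a rough illustration of that general idea only, the sketch below encodes one joint as a 2D Gaussian heatmap and decodes it again by taking the peak. It is not the OmniPose implementation: the function names, the `sigma` value, and the 64x48 map size are assumptions chosen for the example, and the abstract does not specify how the modulation is actually performed.

```python
import math

def gaussian_heatmap(height, width, cx, cy, sigma=2.0):
    """Build a height x width map (list of rows) with a Gaussian peak at (cx, cy).

    sigma is a hypothetical spread in pixels, picked only for illustration.
    """
    return [
        [
            math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]

def decode_joint(heatmap):
    """Return the (x, y) position of the strongest response in the map."""
    best_value, best_x, best_y = -1.0, 0, 0
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best_value:
                best_value, best_x, best_y = value, x, y
    return best_x, best_y

# Encode a joint at x=20, y=37 on an assumed 64x48 map, then decode it.
heatmap = gaussian_heatmap(64, 48, cx=20, cy=37)
joint = decode_joint(heatmap)
```

In a heatmap-based estimator, a network would typically predict one such map per joint and a decoding step like `decode_joint` would read off the coordinates; how OmniPose trains and refines these maps is described in the paper itself, not here.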

updated: Thu Mar 18 2021 11:30:31 GMT+0000 (UTC)

published: Thu Mar 18 2021 11:30:31 GMT+0000 (UTC)

arXiv
