Hybrid model for Single-Stage Multi-Person Pose Estimation

Jonghyun Kim; Bosang Kim; Hyotae Lee; Jungpyo Kim; Wonhyeok Im; Lanying Jin; Dowoo Kwon; Jungho Lee

Human pose estimation methods generally fall into two categories according to their architecture: regression-based (i.e., heatmap-free) and heatmap-based. Regression-based methods directly estimate the coordinates of each keypoint using convolutional and fully-connected layers. Although this approach can detect overlapping and densely placed keypoints, it can produce unexpected results for keypoints that are not present in the scene. Heatmap-based methods, on the other hand, can filter out non-existent keypoints by using the heatmap predicted for each keypoint. However, they suffer from quantization error when keypoint coordinates are extracted from the heatmaps, and, unlike regression-based methods, they have difficulty distinguishing densely placed keypoints in an image. To address this, we propose HybridPose, a hybrid model for single-stage multi-person pose estimation that combines the strengths of both approaches so that each compensates for the other's drawbacks. Furthermore, we introduce a self-correlation loss to inject spatial dependencies between keypoint coordinates and their visibility. As a result, HybridPose can both detect densely placed keypoints and filter out keypoints that are not present in the image. Experimental results demonstrate that the proposed HybridPose predicts keypoint visibility without degrading pose estimation accuracy.
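The quantization error and the visibility filtering attributed to heatmap-based methods above can be illustrated with a minimal sketch. This is not the HybridPose architecture: the stride of 4, the Gaussian width, the 0.5 score threshold, and the function names `render_heatmap` and `decode_argmax` are all assumptions chosen for illustration.

```python
import math

STRIDE = 4        # assumed downsampling factor between image and heatmap
THRESHOLD = 0.5   # assumed score threshold for treating a keypoint as present

def render_heatmap(x, y, size=32, sigma=1.0):
    """Render a Gaussian heatmap for a keypoint at image coordinates (x, y)."""
    cx, cy = x / STRIDE, y / STRIDE
    return [[math.exp(-((j - cx) ** 2 + (i - cy) ** 2) / (2 * sigma ** 2))
             for j in range(size)] for i in range(size)]

def decode_argmax(heatmap):
    """Return (x, y, score) of the highest-scoring cell in image coordinates."""
    best_score, best_i, best_j = -1.0, 0, 0
    for i, row in enumerate(heatmap):
        for j, value in enumerate(row):
            if value > best_score:
                best_score, best_i, best_j = value, i, j
    return best_j * STRIDE, best_i * STRIDE, best_score

# A keypoint that is present: decoding snaps it to the heatmap grid,
# so the recovered coordinates differ from the true sub-pixel location.
true_x, true_y = 37.3, 52.6
x, y, score = decode_argmax(render_heatmap(true_x, true_y))
error = math.hypot(x - true_x, y - true_y)
print(f"decoded=({x}, {y}) score={score:.3f} quantization error={error:.2f}px")

# A keypoint that is absent: an all-zero heatmap has no confident peak,
# so the score threshold filters it out.
empty = [[0.0] * 32 for _ in range(32)]
_, _, empty_score = decode_argmax(empty)
absent_is_visible = empty_score >= THRESHOLD
print(f"absent keypoint visible? {absent_is_visible}")
```

A regression head would output `(37.3, 52.6)`-style continuous coordinates directly and so avoid the grid snapping, but it has no peak score to threshold, which is the trade-off the abstract describes.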

updated: Tue May 02 2023 02:55:29 GMT+0000 (UTC)

published: Tue May 02 2023 02:55:29 GMT+0000 (UTC)

arXiv
