The Center of Attention: Center-Keypoint Grouping via Attention for Multi-Person Pose Estimation

Guillem Brasó; Nikita Kister; Laura Leal-Taixé

We introduce CenterGroup, an attention-based framework to estimate human poses from a set of identity-agnostic keypoints and person center predictions in an image. Our approach uses a transformer to obtain context-aware embeddings for all detected keypoints and centers and then applies multi-head attention to directly group joints into their corresponding person centers. While most bottom-up methods rely on non-learnable clustering at inference, CenterGroup uses a fully differentiable attention mechanism that we train end-to-end together with our keypoint detector. As a result, our method obtains state-of-the-art performance with up to 2.5x faster inference time than competing bottom-up methods. Our code is available at https://github.com/dvl-tum/center-group.
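
To make the grouping idea concrete, the following is a minimal sketch of how attention could assign keypoints to person centers. It is not the authors' implementation: it uses a single attention head with no learned projections, and the function name `group_keypoints`, its arguments, and the per-joint-type masking are assumptions made for illustration only. Each center embedding acts as a query over the keypoint embeddings of one joint type, and the predicted joint location is the attention-weighted average of the candidate keypoint coordinates.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def group_keypoints(center_emb, kp_emb, kp_xy, kp_type, num_types):
    """Hypothetical single-head grouping of keypoints into person centers.

    center_emb: (C, D) embeddings of detected person centers (queries).
    kp_emb:     (K, D) embeddings of detected keypoints (keys).
    kp_xy:      (K, 2) image coordinates of the keypoints (values).
    kp_type:    (K,)   integer joint type of each keypoint.
    Returns poses of shape (C, num_types, 2); entries stay NaN for joint
    types that have no detected candidates.
    """
    num_centers, dim = center_emb.shape
    poses = np.full((num_centers, num_types, 2), np.nan)

    # Scaled dot-product scores between every center and every keypoint.
    scores = center_emb @ kp_emb.T / np.sqrt(dim)  # (C, K)

    for t in range(num_types):
        mask = kp_type == t
        if not mask.any():
            continue
        # Each center attends only over candidates of joint type t.
        attn = softmax(scores[:, mask], axis=1)  # (C, K_t)
        # Joint location is the attention-weighted mean of the candidates.
        poses[:, t] = attn @ kp_xy[mask]
    return poses
```

Because every step is a matrix product or a softmax, the whole assignment is differentiable, which is the property the abstract contrasts with non-learnable clustering. In the actual method the embeddings come from a transformer and the attention is multi-headed; here they are simply taken as given inputs.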

updated: Mon Oct 11 2021 10:22:04 GMT+0000 (UTC)

published: Mon Oct 11 2021 10:22:04 GMT+0000 (UTC)

arXiv
