A Gated Attention Transformer for Multi-Person Pose Tracking

Andreas Doering; Juergen Gall

Multi-person pose tracking is an important element of many applications and requires estimating the human poses of all persons in a video and tracking them over time. The association of poses across frames remains an open research problem, in particular for online tracking methods, due to motion blur, crowded scenes and occlusions. To tackle the association challenge, we propose a Gated Attention Transformer. The core aspect of our model is the gating mechanism that automatically adapts the impact of appearance embeddings and embeddings based on temporal pose similarity in the attention layers. In order to re-identify persons that have been occluded, we incorporate a pose-conditioned re-identification network that provides initial embeddings and allows persons to be matched even if the number of visible joints differs between frames. We further propose a matching layer based on gated attention for pose-to-track association and duplicate removal. We evaluate our approach on PoseTrack 2018 and PoseTrack21.
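The gating idea in the abstract can be illustrated with a small sketch. The abstract gives no formula, so everything below is an assumption made for illustration and not the paper's architecture: the function name `gated_attention`, the sigmoid gate, its parameters `gate_weight` and `gate_bias`, and the choice to favour the pose cue when temporal pose similarity is high are all hypothetical. The sketch blends a track-to-pose appearance similarity with a pose similarity through a per-pair gate and normalises each track's scores with a softmax.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def gated_attention(appearance_sim, pose_sim, gate_weight=4.0, gate_bias=-2.0):
    """Blend two similarity matrices (tracks x poses) with a per-pair gate.

    Hypothetical illustration only: the gate is a sigmoid of the pose
    similarity, so a pair with high temporal pose similarity leans on the
    pose cue and a pair with low pose similarity leans on appearance.
    gate_weight and gate_bias stand in for learned parameters.
    """
    attention = []
    for app_row, pose_row in zip(appearance_sim, pose_sim):
        scores = []
        for a, p in zip(app_row, pose_row):
            g = sigmoid(gate_weight * p + gate_bias)  # gate value in (0, 1)
            scores.append(g * p + (1.0 - g) * a)      # gated mix of both cues
        attention.append(softmax(scores))             # normalise per track
    return attention


# Two tracks, two candidate poses; similarity values are made up.
appearance = [[0.9, 0.1], [0.2, 0.8]]
pose = [[0.8, 0.1], [0.1, 0.9]]
weights = gated_attention(appearance, pose)
```

Each row of `weights` sums to one and can be read as how strongly a track attends to each candidate pose. In a trained model the gate would be learned jointly with the embeddings instead of being fixed by hand as it is here.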

updated: Mon Aug 21 2023 17:45:29 GMT+0000 (UTC)

published: Fri Jun 09 2023 10:44:44 GMT+0000 (UTC)

arXiv
