Effective Whole-body Pose Estimation with Two-stages Distillation

Zhendong Yang; Ailing Zeng; Chun Yuan; Yu Li

Whole-body pose estimation localizes the human body, hand, face, and foot keypoints in an image. This task is challenging due to multi-scale body parts, fine-grained localization in low-resolution regions, and data scarcity. Meanwhile, there is an urgent need to apply a highly efficient and accurate pose estimator to a wide range of human-centric understanding and generation tasks. In this work, we present a two-stage pose Distillation for Whole-body Pose estimators, named DWPose, to improve their effectiveness and efficiency. The first-stage distillation designs a weight-decay strategy while using a teacher's intermediate features and final logits, covering both visible and invisible keypoints, to supervise the student from scratch. The second stage distills the student model itself to further improve performance. Unlike previous self-knowledge distillation, this stage finetunes only the student's head, using just 20% of the training time, as a plug-and-play training strategy. To address data limitations, we explore the UBody dataset, which contains diverse facial expressions and hand gestures for real-life applications. Comprehensive experiments show the superiority of our simple yet effective methods. We achieve new state-of-the-art performance on COCO-WholeBody, boosting the whole-body AP of RTMPose-l from 64.8% to 66.5% and even surpassing the RTMPose-x teacher, which reaches 65.3% AP. We release a series of models of different sizes, from tiny to large, to suit various downstream tasks. Our code and models are available at https://github.com/IDEA-Research/DWPose.
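The first-stage objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formulas, so everything below is an assumption made for illustration: the linear decay schedule, the mean-squared-error feature loss, the KL-divergence logit loss, and the names `decay_weight`, `feature_loss`, `logit_loss`, `stage_one_loss`, `alpha`, and `beta` are all hypothetical and are not taken from the DWPose code.

```python
import math


def decay_weight(epoch, max_epochs):
    """Assumed schedule: the distillation weight falls linearly from 1 toward 0."""
    if not 1 <= epoch <= max_epochs:
        raise ValueError("epoch must be in [1, max_epochs]")
    return 1.0 - (epoch - 1) / max_epochs


def feature_loss(student_feat, teacher_feat):
    """Mean squared error between two equally sized feature vectors (assumed loss)."""
    if len(student_feat) != len(teacher_feat):
        raise ValueError("feature vectors must have the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared) / len(squared)


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_sum for x in logits]


def logit_loss(student_logits, teacher_logits):
    """KL(teacher || student) for one keypoint's logits (assumed loss).

    No visibility mask is applied, so visible and invisible keypoints
    would both contribute, as the abstract describes.
    """
    log_s = log_softmax(student_logits)
    log_t = log_softmax(teacher_logits)
    return sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))


def stage_one_loss(task_loss, feat_loss, kd_loss, epoch, max_epochs,
                   alpha=1.0, beta=1.0):
    """Task loss plus decayed distillation terms; alpha and beta are hypothetical."""
    weight = decay_weight(epoch, max_epochs)
    return task_loss + weight * (alpha * feat_loss + beta * kd_loss)


if __name__ == "__main__":
    f = feature_loss([0.2, 0.4], [0.1, 0.5])
    k = logit_loss([1.0, 2.0, 0.5], [1.2, 1.8, 0.4])
    for epoch in (1, 5, 10):
        print(epoch, stage_one_loss(1.0, f, k, epoch, max_epochs=10))
```

Under this assumed schedule the teacher's guidance dominates early training and fades as the student converges. The second stage would then keep the backbone fixed and update only the head, which is not shown here.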

updated: Sat Jul 29 2023 03:49:28 GMT+0000 (UTC)

published: Sat Jul 29 2023 03:49:28 GMT+0000 (UTC)

arXiv
