Online Knowledge Distillation for Efficient Pose Estimation

Zheng Li; Jingwen Ye; Mingli Song; Ying Huang; Zhigeng Pan

Existing state-of-the-art human pose estimation methods require heavy computational resources for accurate predictions. One promising technique for obtaining an accurate yet lightweight pose estimator is knowledge distillation, which transfers pose knowledge from a powerful teacher model to a student model with fewer parameters. However, existing pose distillation works rely on a heavy pre-trained estimator to perform the knowledge transfer and require a complex two-stage learning procedure. In this work, we investigate a novel Online Knowledge Distillation framework, termed OKDHP, that distills Human Pose structure knowledge in a one-stage manner to guarantee distillation efficiency. Specifically, OKDHP trains a single multi-branch network and acquires the predicted heatmaps from each branch, which are then assembled by a Feature Aggregation Unit (FAU) into target heatmaps that teach each branch in reverse. Instead of simply averaging the heatmaps, the FAU, which consists of multiple parallel transformations with different receptive fields, leverages multi-scale information and thus obtains higher-quality target heatmaps. The pixel-wise Kullback-Leibler (KL) divergence is used to minimize the discrepancy between the target heatmaps and the predicted ones, which enables the student network to learn the implicit keypoint relationships. In addition, an unbalanced OKDHP scheme is introduced to customize student networks with different compression rates. The effectiveness of our approach is demonstrated by extensive experiments on two common benchmark datasets, MPII and COCO.
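
The distillation signal described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, a weighted per-pixel average stands in for the learned FAU, and normalizing each heatmap with a spatial softmax before taking the KL divergence is an assumption made here so that the divergence is well defined.

```python
import math


def spatial_softmax(heatmap):
    """Flatten an H x W heatmap and normalize it into a distribution over pixels.

    Assumption: the abstract does not say how heatmaps are normalized.
    """
    flat = [v for row in heatmap for v in row]
    peak = max(flat)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in flat]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), summed over pixels; eps guards against log(0)."""
    return sum(
        t * math.log((t + eps) / (p + eps)) for t, p in zip(target, predicted)
    )


def aggregate(branch_heatmaps, weights=None):
    """Weighted per-pixel average of the branch heatmaps.

    Assumption: this is only a stand-in for the FAU, which the abstract
    describes as parallel transformations with different receptive fields.
    """
    n = len(branch_heatmaps)
    if weights is None:
        weights = [1.0 / n] * n
    rows = len(branch_heatmaps[0])
    cols = len(branch_heatmaps[0][0])
    return [
        [
            sum(w * hm[r][c] for w, hm in zip(weights, branch_heatmaps))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def distillation_losses(branch_heatmaps, weights=None):
    """Return one KL loss per branch, each measured against the shared target."""
    target = spatial_softmax(aggregate(branch_heatmaps, weights))
    return [kl_divergence(target, spatial_softmax(hm)) for hm in branch_heatmaps]


if __name__ == "__main__":
    # Two toy 2 x 2 heatmaps for a single keypoint, peaking at different pixels.
    branch_a = [[0.0, 1.0], [0.0, 0.0]]
    branch_b = [[0.0, 0.0], [1.0, 0.0]]
    print(distillation_losses([branch_a, branch_b]))
```

In this sketch every branch is pulled toward the same aggregated target, so branches that already agree receive a zero loss, and branches that disagree are penalized in proportion to how far their heatmap distribution lies from the ensemble.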

updated: Wed Aug 04 2021 14:49:44 GMT+0000 (UTC)

published: Wed Aug 04 2021 14:49:44 GMT+0000 (UTC)

arXiv
