Human keypoint detection from a single image is very challenging due to occlusion, blur, illumination and scale variance of person instances. In this paper, we find that context information plays an important role in addressing these issues, and propose a novel method named progressive context refinement (PCR) for human keypoint detection. First, we devise a simple but effective context-aware module (CAM) that can efficiently integrate spatial and channel context information to aid feature learning for locating hard keypoints. Then, we construct the PCR model by stacking several CAMs sequentially with shortcuts and employ multi-task learning to progressively refine the context information and predictions. Besides, to maximize PCR's potential for the aforementioned hard case inference, we propose a hard-negative person detection mining strategy together with a joint-training strategy by exploiting the unlabeled coco dataset and external dataset. Extensive experiments on the COCO keypoint detection benchmark demonstrate the superiority of PCR over representative state-of-the-art (SOTA) methods. Our single model achieves comparable performance with the winner of the 2018 COCO Keypoint Detection Challenge. The final ensemble model sets a new SOTA on this benchmark.
updated: Sun Oct 27 2019 09:57:08 GMT+0000 (UTC)
published: Sun Oct 27 2019 09:57:08 GMT+0000 (UTC)