This paper provides a comprehensive and exhaustive study of adversarial attacks on human pose estimation models and the evaluation of their robustness. Besides highlighting the important differences between well-studied classification and human pose-estimation systems w.r.t. adversarial attacks, we also provide deep insights into the design choices of pose-estimation systems to shape future work. We benchmark the robustness of several 2D single person pose-estimation architectures trained on multiple datasets, MPII and COCO. In doing so, we also explore the problem of attacking non-classification networks including regression based networks, which has been virtually unexplored in the past. \par We find that compared to classification and semantic segmentation, human pose estimation architectures are relatively robust to adversarial attacks with the single-step attacks being surprisingly ineffective. Our study shows that the heatmap-based pose-estimation models are notably robust than their direct regression-based systems and that the systems which explicitly model anthropomorphic semantics of human body fare better than their other counterparts. Besides, targeted attacks are more difficult to obtain than un-targeted ones and some body-joints are easier to fool than the others. We present visualizations of universal perturbations to facilitate unprecedented insights into their workings on pose-estimation. Additionally, we show them to generalize well across different networks. Finally we perform a user study about perceptibility of these examples.
updated: Thu Jun 10 2021 05:27:07 GMT+0000 (UTC)
published: Sun Aug 18 2019 09:04:26 GMT+0000 (UTC)