PedHunter: Occlusion Robust Pedestrian Detector in Crowded Scenes

Cheng Chi; Shifeng Zhang; Junliang Xing; Zhen Lei; Stan Z. Li; Xudong Zou

Pedestrian detection in crowded scenes is a challenging problem because occlusion happens frequently among different pedestrians. In this paper, we propose an effective and efficient detection network to hunt pedestrians in crowded scenes. The proposed method, PedHunter, adds strong occlusion handling ability to existing region-based detection networks without extra computation at inference time. Specifically, we design a mask-guided module that leverages head information to enhance the feature representation learning of the backbone network. Moreover, we develop a strict classification criterion that improves the quality of positive samples during training, eliminating common false positives of pedestrian detection in crowded scenes. In addition, we present an occlusion-simulated data augmentation that enriches the pattern and quantity of occlusion samples to improve occlusion robustness. As a consequence, we achieve state-of-the-art results on three pedestrian detection datasets: CityPersons, Caltech-USA and CrowdHuman. To facilitate further studies on occluded pedestrian detection in surveillance scenes, we release a new pedestrian dataset, called SUR-PED, with a total of over 162k high-quality manually labeled instances in 10k images. The proposed dataset, source code and trained models will be released.
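The abstract does not say how the occlusion-simulated data augmentation is implemented, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes that an occlusion can be imitated by filling a random rectangle inside a pedestrian's bounding box with a constant value. The function name `simulate_occlusion`, the parameters `fill_value` and `max_fraction`, and the use of a plain 2D list as the image are all hypothetical choices made for illustration.

```python
import random


def simulate_occlusion(image, box, fill_value=0, max_fraction=0.5, rng=None):
    """Return (occluded_copy, region) for one pedestrian box.

    image: 2D list of pixel values (rows of columns).
    box:   (x1, y1, x2, y2) with exclusive end coordinates, at least 1x1.
    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = rng or random.Random()
    x1, y1, x2, y2 = box
    box_w, box_h = x2 - x1, y2 - y1

    # Occluder size: at least 1 pixel, at most max_fraction of each box side.
    occ_w = rng.randint(1, max(1, int(box_w * max_fraction)))
    occ_h = rng.randint(1, max(1, int(box_h * max_fraction)))

    # Pick the top-left corner so the occluder stays inside the box.
    ox = rng.randint(x1, x2 - occ_w)
    oy = rng.randint(y1, y2 - occ_h)

    # Copy the image so the original training sample is left untouched.
    out = [row[:] for row in image]
    for y in range(oy, oy + occ_h):
        for x in range(ox, ox + occ_w):
            out[y][x] = fill_value
    return out, (ox, oy, ox + occ_w, oy + occ_h)


if __name__ == "__main__":
    image = [[255] * 8 for _ in range(8)]
    occluded, region = simulate_occlusion(image, (2, 1, 6, 7), rng=random.Random(0))
    print("occluded region:", region)
```

Because the occluder is drawn only inside the annotated box, the bounding-box label stays valid while the visible part of the pedestrian shrinks, which is one plausible way to increase the number and variety of occluded training samples.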

updated: Sun Sep 15 2019 16:02:25 GMT+0000 (UTC)

published: Sun Sep 15 2019 16:02:25 GMT+0000 (UTC)

arXiv
