Dense Point Prediction: A Simple Baseline for Crowd Counting and Localization

Yi Wang; Xinyu Hou; Lap-Pui Chau

In this paper, we propose a simple yet effective crowd counting and localization network named SCALNet. Unlike most existing works, which treat counting and localization as separate tasks, we consider both as a pixel-wise dense prediction problem and integrate them into an end-to-end framework. Specifically, for crowd counting, we adopt a counting head supervised by the Mean Square Error (MSE) loss. For crowd localization, the key insight is to recognize a keypoint for each person, i.e., the center point of the head. We propose a localization head that distinguishes individuals in dense crowds and is trained with two loss functions, the Negative-Suppressed Focal (NSF) loss and the False-Positive (FP) loss, which balance the positive/negative examples and handle false-positive predictions. Experiments on the recent large-scale benchmark NWPU-Crowd show that our approach outperforms state-of-the-art methods by more than 5% and 10% in the crowd localization and counting tasks, respectively. The code is publicly available at https://github.com/WangyiNTU/SCALNet.
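To make the pixel-wise formulation concrete, the following is a minimal NumPy sketch of two dense losses of the kind the abstract mentions. It is not the authors' implementation: the abstract does not define the NSF or FP losses, so `focal_point_loss` below is a generic focal-style loss with a hypothetical `neg_weight` factor standing in for the idea of balancing positives and negatives. All function names, parameters, and the toy 5x5 maps are assumptions made for illustration.

```python
import numpy as np


def counting_mse_loss(pred_density, gt_density):
    """Mean squared error between predicted and ground-truth density maps."""
    return float(np.mean((pred_density - gt_density) ** 2))


def predicted_count(pred_density):
    """With a density-map head, the crowd count is the sum over all pixels."""
    return float(pred_density.sum())


def focal_point_loss(pred_prob, gt_mask, alpha=2.0, neg_weight=0.1, eps=1e-7):
    """Generic focal-style loss over a head-center probability map.

    Stand-in only, NOT the paper's NSF loss. `neg_weight` (hypothetical)
    down-weights the far more numerous negative pixels.
    """
    p = np.clip(pred_prob, eps, 1.0 - eps)
    pos = gt_mask == 1
    # Positives: penalize low confidence at annotated head centers.
    pos_term = -((1.0 - p[pos]) ** alpha) * np.log(p[pos])
    # Negatives: penalize high confidence everywhere else.
    neg_term = -(p[~pos] ** alpha) * np.log(1.0 - p[~pos])
    num_pos = max(int(pos.sum()), 1)
    return float((pos_term.sum() + neg_weight * neg_term.sum()) / num_pos)


# Toy example: two annotated head centers on a 5x5 grid.
gt_mask = np.zeros((5, 5), dtype=int)
gt_mask[1, 1] = 1
gt_mask[3, 2] = 1
gt_density = gt_mask.astype(float)  # unsmoothed density, for simplicity

good_pred = np.where(gt_mask == 1, 0.99, 0.01)  # confident and correct
flat_pred = np.full((5, 5), 0.5)                # uninformative

loss_good = focal_point_loss(good_pred, gt_mask)
loss_flat = focal_point_loss(flat_pred, gt_mask)
```

In this toy setup the confident, correct map yields a lower localization loss than the uninformative one, and summing the density map recovers the number of annotated heads. A real pipeline would typically smooth the point annotations (for example with Gaussian kernels) and train both heads jointly; those details are left to the paper and the linked repository.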

updated: Mon Apr 26 2021 12:08:08 GMT+0000 (UTC)

published: Mon Apr 26 2021 12:08:08 GMT+0000 (UTC)

arXiv
