We seek to improve crowd counting, as we perceive limits in the currently prevalent density map estimation approach in both prediction accuracy and time efficiency. We leverage multilevel pixelation of the density map, as it helps improve the signal-to-noise ratio (SNR) of the training data and therefore reduces prediction error. To achieve a better model, we introduce multilayer gradient fusion for training a density-aware global count regressor. More specifically, during training, a backbone network receives gradients from multiple branches to learn the density information, whereas these branches are detached at inference time to speed up prediction. By taking advantage of this method, our model improves benchmark results on public datasets and proves to be a practical new solution to crowd counting problems.
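To make the idea of multilevel pixelation concrete, the following is a minimal sketch under one assumed reading: each level is obtained by sum-pooling the density map into coarser square cells, so that every level integrates to the same global count. The function names (`pixelate`, `multilevel_pixelation`, `total_count`), the block sizes, and the toy map are hypothetical and chosen only for illustration; the abstract does not specify how the pixelation is actually performed.

```python
def pixelate(density, block):
    """Sum-pool a 2-D density map (list of rows) into block x block cells.

    Assumption: "pixelation" is modeled here as non-overlapping sum pooling.
    """
    rows, cols = len(density), len(density[0])
    if rows % block or cols % block:
        raise ValueError("map size must be divisible by the block size")
    return [
        [
            sum(density[r + dr][c + dc]
                for dr in range(block) for dc in range(block))
            for c in range(0, cols, block)
        ]
        for r in range(0, rows, block)
    ]


def multilevel_pixelation(density, blocks=(1, 2, 4)):
    """Return one pixelated map per block size (hypothetical level choice)."""
    return {b: pixelate(density, b) for b in blocks}


def total_count(density):
    """Global count = integral (sum) of the density map."""
    return sum(sum(row) for row in density)


# Toy 4x4 density map containing two people (total mass 2.0).
toy = [
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.25, 0.25],
    [0.0, 0.0, 0.25, 0.25],
]

levels = multilevel_pixelation(toy)
for block, pooled in levels.items():
    print(block, pooled, total_count(pooled))
```

Because sum pooling only regroups mass, the global count is identical at every level; coarser levels simply concentrate each person's density into fewer cells, which is one way to picture a supervision signal that is less sensitive to exact annotation placement.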
updated: Sun Aug 02 2020 02:02:05 GMT+0000 (UTC)
published: Fri Aug 09 2019 04:44:13 GMT+0000 (UTC)