Over the past few years, researchers have presented many applications of convolutional neural networks, including the detection and recognition of objects in images. The desire to understand our own nature has always been an important motivation for research, and the visual recognition of humans is accordingly among the most important problems facing machine learning today. Most solutions to this task have been developed and tested on several publicly available datasets. These datasets typically contain images taken by street-level closed-circuit television cameras, which offer a low-angle view. Such images differ substantially from those taken from the sky. In addition, aerial images are often very congested, containing hundreds of targets. These factors may have a significant impact on the quality of the results. In this paper, we investigate state-of-the-art methods for counting pedestrians and evaluate their performance on aerial footage. Furthermore, we analyze this performance with respect to the congestion level of the images.
updated: Tue Nov 05 2019 09:07:35 GMT+0000 (UTC)
published: Tue Nov 05 2019 09:07:35 GMT+0000 (UTC)