Understanding human visual attention and saliency is an integral part of vision research. In this context, there is an ever-present need for fresh and diverse benchmark datasets, particularly for insight into special use cases like crowded scenes. We contribute to this end by: (1) reviewing the dynamics behind saliency and crowds. (2) using eye tracking to create a dynamic human eye fixation dataset over a new set of crowd videos gathered from the Internet. The videos are annotated into three distinct density levels. (3) Finally, we evaluate state-of-the-art saliency models on our dataset to identify possible improvements for the design and creation of a more robust saliency model.
updated: Wed Oct 09 2019 07:40:19 GMT+0000 (UTC)
published: Mon Oct 07 2019 05:43:49 GMT+0000 (UTC)