Transportation systems often rely on understanding the flow of vehicles or pedestrian. From traffic monitoring at the city scale, to commuters in train terminals, recent progress in sensing technology make it possible to use cameras to better understand the demand, i.e., better track moving agents (e.g., vehicles and pedestrians). Whether the cameras are mounted on drones, vehicles, or fixed in the built environments, they inevitably remain scatter. We need to develop the technology to re-identify the same agents across images captured from non-overlapping field-of-views, referred to as the visual re-identification task. State-of-the-art methods learn a neural network based representation trained with the cross-entropy loss function. We argue that such loss function is not suited for the visual re-identification task hence propose to model confidence in the representation learning framework. We show the impact of our confidence-based learning framework with three methods: label smoothing, confidence penalty, and deep variational information bottleneck. They all show a boost in performance validating our claim. Our contribution is generic to any agent of interest, i.e., vehicles or pedestrians, and outperform highly specialized state-of-the-art methods across 5 datasets. The source code and models are shared towards an open science mission.
updated: Thu Sep 10 2020 08:53:43 GMT+0000 (UTC)
published: Tue Jun 11 2019 16:49:27 GMT+0000 (UTC)