Real-time single-stage object detectors based on deep learning remain less accurate than more complex detectors, and the trade-off between accuracy and computational speed is a major challenge. In this paper, we propose a new way to efficiently train a single-shot detector that offers a very good compromise between these two objectives. To this end, we introduce LapNet, an anchor-based detector trained end-to-end without any sampling strategy. Our approach addresses two important problems encountered when training an anchor-based detector: (1) ambiguity in the assignment of anchors to ground truth and (2) class and object-size imbalance. To address the first problem, we propose a soft positive/negative anchor assignment procedure based on a new overlap function called "Per-Object Normalized Overlap" (PONO). The network itself can correct this soft assignment to avoid ambiguity between nearby objects. To cope with the second problem, we propose learning additional weights, which are not used at inference, to manage sample imbalance efficiently. These two contributions make detector training more generic across training datasets. Various experiments show the effectiveness of the proposed approach.
updated: Tue Mar 17 2020 15:18:17 GMT+0000 (UTC)
published: Mon Nov 04 2019 12:13:07 GMT+0000 (UTC)
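
The abstract names PONO but does not define it. As a rough illustration only, the sketch below assumes one plausible reading: the IoU between each anchor and each object is divided by the best IoU that object reaches over all anchors, so every object has at least one anchor with score 1 regardless of its size. The function names, the (x1, y1, x2, y2) box format, and the rule for reducing scores to one soft target per anchor are assumptions made for this example, not the paper's formulation.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def per_object_normalized_overlap(anchors, objects):
    """Hypothetical helper (assumed definition, not taken from the paper).

    Returns scores[i][j]: IoU of anchor i with object j, divided by the
    highest IoU that object j reaches over all anchors.
    """
    scores = [[iou(a, o) for o in objects] for a in anchors]
    for j in range(len(objects)):
        best = max((row[j] for row in scores), default=0.0)
        if best > 0:
            for row in scores:
                row[j] /= best
    return scores


def soft_targets(anchors, objects):
    """Assumed reduction: each anchor keeps its highest normalized overlap."""
    scores = per_object_normalized_overlap(anchors, objects)
    return [max(row, default=0.0) for row in scores]


anchors = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
objects = [(0, 0, 8, 8)]
targets = soft_targets(anchors, objects)
```

Here the first anchor is the best match for the only object and receives a target of 1, the second receives a small positive value, and the third, which does not overlap the object, receives 0. A soft target of this kind replaces a hard positive/negative threshold, which is the general idea the abstract describes.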