Standard convolutional neural networks(CNNs) require consistent image resolutions in both training and testing phase. However, in practice, testing with smaller image sizes is necessary for fast inference. We show that trivially evaluating low-resolution images on networks trained with high-resolution images results in a catastrophic accuracy drop in standard CNN architectures. We propose a novel training regime called Scale calibrated Training(SCT) which allows networks to learn from various scales of input simultaneously. By taking advantages of SCT, single network can provide decent accuracy at test time in response to multiple test scales. In our analysis, we surprisingly find that vanilla batch normalization can lead to sub-optimal performance in SCT. Therefore, a novel normalization scheme called Scale-Specific Batch Normalization is equipped to SCT in replacement of batch normalization. Experiment results show that SCT improves accuracy of single Resnet-50 on ImageNet by 1.7% and 11.5% accuracy when testing on image sizes of 224 and 128 respectively.
updated: Mon Sep 07 2020 15:09:01 GMT+0000 (UTC)
published: Sat Aug 31 2019 10:01:37 GMT+0000 (UTC)