Fixing the train-test resolution discrepancy

Hugo Touvron; Andrea Vedaldi; Matthijs Douze; Hervé Jégou

Data augmentation is key to the training of neural networks for image classification. This paper first shows that existing augmentations induce a significant discrepancy between the typical size of the objects seen by the classifier at train and test time. We experimentally validate that, for a target test resolution, using a lower train resolution offers better classification at test time. We then propose a simple yet effective and efficient strategy to optimize the classifier performance when the train and test resolutions differ. It involves only a computationally cheap fine-tuning of the network at the test resolution. This enables training strong classifiers using small training images. For instance, we obtain 77.1% top-1 accuracy on ImageNet with a ResNet-50 trained on 128x128 images, and 79.8% with one trained on 224x224 images. In addition, if we use extra training data, we get 82.5% with a ResNet-50 trained on 224x224 images. Conversely, when training a ResNeXt-101 32x48d pre-trained in a weakly-supervised fashion on 940 million public images at resolution 224x224 and further optimizing for test resolution 320x320, we obtain a test top-1 accuracy of 86.4% (top-5: 98.0%) (single-crop). To the best of our knowledge, this is the highest ImageNet single-crop top-1 and top-5 accuracy to date.

updated: Thu Jan 20 2022 11:02:01 GMT+0000 (UTC)

published: Fri Jun 14 2019 22:27:30 GMT+0000 (UTC)

arXiv
