Revisiting the Transferability of Supervised Pretraining: an MLP Perspective

Yizhou Wang; Shixiang Tang; Feng Zhu; Lei Bai; Rui Zhao; Donglian Qi; Wanli Ouyang

The pretrain-finetune paradigm is a classical pipeline in visual learning. Recent unsupervised pretraining methods show better transfer performance than their supervised counterparts. This paper revisits this phenomenon and sheds new light on the transferability gap between unsupervised and supervised pretraining from a multilayer perceptron (MLP) perspective. While previous work focuses on the effectiveness of the MLP in unsupervised image classification, where pretraining and evaluation are conducted on the same dataset, we reveal that the MLP projector is also the key factor behind the better transferability of unsupervised pretraining methods compared with supervised ones. Based on this observation, we attempt to close the transferability gap between supervised and unsupervised pretraining by adding an MLP projector before the classifier in supervised pretraining. Our analysis indicates that the MLP projector can help retain the intra-class variation of visual features, decrease the feature distribution distance between the pretraining and evaluation datasets, and reduce feature redundancy. Extensive experiments on public benchmarks demonstrate that the added MLP projector significantly boosts the transferability of supervised pretraining, e.g., +7.2% top-1 accuracy on the concept generalization task, +5.8% top-1 accuracy for linear evaluation on 12-domain classification tasks, and +0.8% AP on the COCO object detection task, making supervised pretraining comparable to or even better than unsupervised pretraining. Code will be released upon acceptance.
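
The architectural change described in the abstract, an MLP projector inserted between the backbone features and the classifier, can be illustrated with a minimal pure-Python sketch. The abstract does not specify the projector's design, so the two-layer layout with a ReLU, the layer sizes, the initialisation, and every name below (`SupervisedHeadWithProjector`, `project`, `logits`) are assumptions made for illustration, not the authors' implementation.

```python
import random


def linear(x, weight, bias):
    """Affine map weight @ x + bias for a single feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def init_linear(in_dim, out_dim, rng):
    """Small random weights and zero bias (illustrative initialisation only)."""
    scale = in_dim ** -0.5
    weight = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
              for _ in range(out_dim)]
    return weight, [0.0] * out_dim


class SupervisedHeadWithProjector:
    """Hypothetical head: backbone features -> MLP projector -> linear classifier."""

    def __init__(self, feat_dim, hidden_dim, proj_dim, num_classes, seed=0):
        rng = random.Random(seed)
        # Assumed projector layout: linear -> ReLU -> linear.
        self.fc1 = init_linear(feat_dim, hidden_dim, rng)
        self.fc2 = init_linear(hidden_dim, proj_dim, rng)
        # The classifier now reads projected features, not raw backbone features.
        self.classifier = init_linear(proj_dim, num_classes, rng)

    def project(self, features):
        hidden = relu(linear(features, *self.fc1))
        return linear(hidden, *self.fc2)

    def logits(self, features):
        return linear(self.project(features), *self.classifier)


head = SupervisedHeadWithProjector(feat_dim=8, hidden_dim=16, proj_dim=4, num_classes=3)
features = [0.1 * i for i in range(8)]  # stand-in for a backbone output vector
print(len(head.project(features)), len(head.logits(features)))  # prints "4 3"
```

In common unsupervised pretraining setups the projector is discarded after pretraining and only the backbone is transferred; the abstract does not say whether the same is done here, so the sketch leaves that step out.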

updated: Wed Dec 01 2021 13:47:30 GMT+0000 (UTC)

published: Wed Dec 01 2021 13:47:30 GMT+0000 (UTC)

arXiv
