Partial Is Better Than All: Revisiting Fine-tuning Strategy for Few-shot Learning

Zhiqiang Shen; Zechun Liu; Jie Qin; Marios Savvides; Kwang-Ting Cheng

The goal of few-shot learning is to learn a classifier that can recognize unseen classes from limited labeled support data. A common practice for this task is to train a model on the base set first and then transfer it to novel classes through fine-tuning or meta-learning. (Here, fine-tuning is defined as transferring knowledge from base to novel data, i.e., learning to transfer in the few-shot scenario.) However, as the base classes do not overlap with the novel set, simply transferring all knowledge from the base data is not an optimal solution, since some knowledge in the base model may be biased or even harmful to the novel classes. In this paper, we propose to transfer partial knowledge by freezing or fine-tuning particular layer(s) in the base model. Specifically, layers chosen for fine-tuning are assigned different learning rates to control how much transferability is preserved. To determine which layers to recast and which learning rates to use for them, we introduce an evolutionary-search-based method that efficiently locates the target layers and determines their individual learning rates simultaneously. We conduct extensive experiments on CUB and mini-ImageNet to demonstrate the effectiveness of the proposed method. It achieves state-of-the-art performance on both meta-learning and non-meta-based frameworks. Furthermore, we extend our method to the conventional pre-training + fine-tuning paradigm and obtain consistent improvements.
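To make the search idea concrete, the following is a minimal sketch of an evolutionary search over per-layer learning rates, where a rate of 0.0 stands for a frozen layer. It is not the authors' implementation: the candidate rates in `LR_CHOICES`, the population settings, the mutation and crossover rules, and the `toy_fitness` function are all assumptions made for illustration. In a real setting, the fitness would presumably be the validation accuracy obtained after fine-tuning the base model with the candidate's per-layer rates.

```python
import random

# Candidate per-layer learning rates; 0.0 means "freeze this layer".
# These values are illustrative assumptions, not the paper's search space.
LR_CHOICES = (0.0, 0.001, 0.01, 0.1)


def random_candidate(num_layers, rng):
    """Sample one learning rate per layer."""
    return tuple(rng.choice(LR_CHOICES) for _ in range(num_layers))


def mutate(candidate, rng, prob=0.3):
    """Resample each layer's rate independently with probability `prob`."""
    return tuple(
        rng.choice(LR_CHOICES) if rng.random() < prob else lr
        for lr in candidate
    )


def crossover(a, b, rng):
    """Build a child by picking each layer's rate from one of two parents."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def evolutionary_search(fitness, num_layers, population=20,
                        generations=15, top_k=5, seed=0):
    """Return the best per-layer learning-rate tuple found."""
    rng = random.Random(seed)
    pop = [random_candidate(num_layers, rng) for _ in range(population)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Keep the top_k candidates as parents (elitism).
        parents = sorted(pop, key=fitness, reverse=True)[:top_k]
        children = []
        while len(children) < population - top_k:
            if rng.random() < 0.5:
                child = mutate(rng.choice(parents), rng)
            else:
                child = crossover(rng.choice(parents),
                                  rng.choice(parents), rng)
            children.append(child)
        pop = parents + children
        best = max(pop + [best], key=fitness)
    return best


def toy_fitness(candidate):
    """Hypothetical stand-in for validation accuracy after fine-tuning.

    Rewards candidates close to an arbitrary target configuration in which
    the first two layers are frozen and the last two are fine-tuned.
    """
    target = (0.0, 0.0, 0.001, 0.01)
    return -sum(abs(lr - t) for lr, t in zip(candidate, target))


if __name__ == "__main__":
    best = evolutionary_search(toy_fitness, num_layers=4)
    for i, lr in enumerate(best):
        status = "frozen" if lr == 0.0 else f"fine-tuned, lr={lr}"
        print(f"layer {i}: {status}")
```

Encoding freezing as a zero learning rate lets a single tuple describe both decisions from the abstract (which layers to fine-tune and how strongly), so one search covers both at once.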

updated: Mon Feb 08 2021 03:27:05 GMT+0000 (UTC)

published: Mon Feb 08 2021 03:27:05 GMT+0000 (UTC)

arXiv
