Non-binary deep transfer learning for imageclassification

Jo Plested; Xuyang Shen; Tom Gedeon

画像分類のための非バイナリディープトランスファー学習

少数のラベル付きトレーニング例を使用するさまざまなコンピュータービジョンタスクの現在の標準は、ImageNetなどの大規模な画像分類データセットで事前トレーニングされた重みから微調整することです。転移学習と転移学習の方法の適用は、厳密にバイナリになる傾向があります。モデルは事前にトレーニングされているか、事前にトレーニングされていません。モデルを事前トレーニングすると、パフォーマンスが向上するか低下します。後者は負の転送として定義されます。重みを事前にトレーニングされた値に向かって減衰させるL2-SP正則化の適用が適用されるか、すべての重みが0に向かって減衰されます。このペーパーでは、これらの仮定を再検討します。私たちの推奨事項は、最適な結果を達成するための非バイナリアプローチの適用を実証する広範な経験的評価に基づいています。（1）個々のデータセットで最高のパフォーマンスを実現するには、転送するレイヤーの数、レイヤーごとの学習率の違い、L2SPとL2の正則化の組み合わせの違いなど、通常は考慮されないさまざまな転送学習ハイパーパラメーターを注意深く調整する必要があります。（2）ベストプラクティスは、事前にトレーニングされた重みがターゲットデータセットにどの程度適合しているかを測定し、最適なハイパーパラメーターを導くことで実現できます。 L2SPとL2の正則化を組み合わせたり、従来とは異なる微調整ハイパーパラメータ検索を実行したりするなど、非バイナリ転送学習の方法を紹介します。最後に、最適な転移学習ハイパーパラメータを決定するためのヒューリスティックを提案します。非バイナリアプローチを使用することの利点は、従来は転移学習がより困難であったさまざまなタスクでの最先端のパフォーマンスに近づくか、それを超える最終結果によってサポートされます。

The current standard for a variety of computer vision tasks using smaller numbers of labelled training examples is to fine-tune from weights pre-trained on a large image classification dataset such as ImageNet. The application of transfer learning and transfer learning methods tends to be rigidly binary. A model is either pre-trained or not pre-trained. Pre-training a model either increases performance or decreases it, the latter being defined as negative transfer. Application of L2-SP regularisation that decays the weights towards their pre-trained values is either applied or all weights are decayed towards 0. This paper re-examines these assumptions. Our recommendations are based on extensive empirical evaluation that demonstrate the application of a non-binary approach to achieve optimal results. (1) Achieving best performance on each individual dataset requires careful adjustment of various transfer learning hyperparameters not usually considered, including number of layers to transfer, different learning rates for different layers and different combinations of L2SP and L2 regularization. (2) Best practice can be achieved using a number of measures of how well the pre-trained weights fit the target dataset to guide optimal hyperparameters. We present methods for non-binary transfer learning including combining L2SP and L2 regularization and performing non-traditional fine-tuning hyperparameter searches. Finally we suggest heuristics for determining the optimal transfer learning hyperparameters. The benefits of using a non-binary approach are supported by final results that come close to or exceed state of the art performance on a variety of tasks that have traditionally been more difficult for transfer learning.

updated: Mon Jul 19 2021 02:34:38 GMT+0000 (UTC)

published: Mon Jul 19 2021 02:34:38 GMT+0000 (UTC)

arXiv

参考文献 (このサイトで利用可能なもの) / References (only if available on this site)

被参照文献 (このサイトで利用可能なものを新しい順に) / Citations (only if available on this site, in order of most recent)

Amazon.co.jpアソシエイト