Variance-Covariance Regularization Improves Representation Learning

Jiachen Zhu; Ravid Shwartz-Ziv; Yubei Chen; Yann LeCun

Transfer learning has emerged as a key approach in machine learning, enabling knowledge derived from one domain to improve performance on subsequent tasks. Because information about these subsequent tasks is often limited, a strong transfer learning approach requires the model to capture a diverse range of features during the initial pretraining stage. However, recent research suggests that, without sufficient regularization, the network tends to concentrate on features that primarily reduce the pretraining loss function. This tendency can result in inadequate feature learning and impaired generalization on target tasks. To address this issue, we propose Variance-Covariance Regularization (VCR), a regularization technique aimed at fostering diversity in the learned network features. Drawing inspiration from recent advances in self-supervised learning, VCR promotes learned representations that exhibit high variance and minimal covariance, thus preventing the network from focusing solely on loss-reducing features. We empirically validate the efficacy of our method through comprehensive experiments coupled with in-depth analytical studies of the learned representations. In addition, we develop an efficient implementation strategy that ensures minimal computational overhead. Our results indicate that VCR is a powerful and efficient method for enhancing transfer learning performance in both supervised and self-supervised learning, opening new possibilities for future research in this domain.
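The abstract describes the regularizer only in words (high variance, minimal covariance), so the following is a minimal sketch rather than the authors' implementation. It assumes a VICReg-style formulation: a hinge that pushes each feature's standard deviation up to a target, plus a penalty on squared off-diagonal covariance entries. The function name `vcr_penalty` and the parameters `var_weight`, `cov_weight`, `target_std`, and `eps` are hypothetical, chosen for illustration.

```python
import numpy as np


def vcr_penalty(z, var_weight=1.0, cov_weight=1.0, target_std=1.0, eps=1e-4):
    """Illustrative variance-covariance penalty for features z of shape (batch, dim).

    Assumed formulation (not taken from the abstract):
      variance term   = mean over features of max(0, target_std - std_j)
      covariance term = sum of squared off-diagonal covariances / dim
    """
    z = np.asarray(z, dtype=np.float64)
    n, d = z.shape

    # Center each feature over the batch.
    z = z - z.mean(axis=0, keepdims=True)

    # Variance term: penalize features whose std falls below the target.
    std = np.sqrt(z.var(axis=0, ddof=1) + eps)
    var_term = np.mean(np.maximum(0.0, target_std - std))

    # Covariance term: penalize correlation between different features.
    cov = (z.T @ z) / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_term = np.sum(off_diag ** 2) / d

    return var_weight * var_term + cov_weight * cov_term


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    diverse = rng.standard_normal((512, 8))          # roughly independent features
    redundant = np.repeat(diverse[:, :1], 8, axis=1)  # one feature copied 8 times
    print("diverse:  ", vcr_penalty(diverse))
    print("redundant:", vcr_penalty(redundant))
```

Under these assumptions, the penalty would be added to the pretraining loss so that redundant or collapsed features cost more than diverse ones. In the demo, the redundant batch scores much higher than the diverse batch because every pair of features is perfectly correlated.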

updated: Fri Jun 23 2023 05:01:02 GMT+0000 (UTC)

published: Fri Jun 23 2023 05:01:02 GMT+0000 (UTC)

arXiv
