The Low-Rank Simplicity Bias in Deep Networks

Minyoung Huh; Hossein Mobahi; Richard Zhang; Brian Cheung; Pulkit Agrawal; Phillip Isola

Modern deep neural networks are highly over-parameterized relative to the data on which they are trained, yet they often generalize remarkably well. A flurry of recent work has asked: why do deep networks not overfit to their training data? We investigate the hypothesis that deeper nets are implicitly biased to find lower-rank solutions and that these are the solutions that generalize well. We prove for the asymptotic case that the percent volume of low effective-rank solutions increases monotonically as linear neural networks are made deeper. We then show empirically that our claim holds for finite-width models. We also find empirically that a similar result holds for non-linear networks: deeper non-linear networks learn a feature space whose kernel has a lower rank. We further demonstrate how linear over-parameterization of deep non-linear models can be used to induce a low-rank bias, improving generalization performance without changing the effective model capacity. We evaluate on various model architectures and demonstrate that linearly over-parameterized models outperform existing baselines on image classification tasks, including ImageNet.
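
The two ideas in the abstract, effective rank versus depth and linear over-parameterization, can be illustrated with a small NumPy sketch. It is not the authors' code. It assumes an entropy-based definition of effective rank (the exponential of the Shannon entropy of the normalized singular values), which may differ from the exact definition used in the paper. The function names `effective_rank`, `collapse`, `deep_linear_forward`, and `mean_effective_rank`, and the choices of width, trial count, and Gaussian initialization, are all hypothetical.

```python
import numpy as np


def effective_rank(matrix):
    """Entropy-based effective rank (an assumed definition, see lead-in)."""
    s = np.linalg.svd(matrix, compute_uv=False)
    p = s / s.sum()          # normalize singular values to a distribution
    p = p[p > 0]             # drop exact zeros so log is defined
    return float(np.exp(-(p * np.log(p)).sum()))


def collapse(layers):
    """Multiply a stack of linear layers into one equivalent matrix.

    layers[0] is applied to the input first, so the product is
    layers[-1] @ ... @ layers[0].
    """
    product = np.eye(layers[0].shape[1])
    for layer in layers:
        product = layer @ product
    return product


def deep_linear_forward(x, layers):
    """Apply the layers one by one with no non-linearity in between."""
    h = x
    for layer in layers:
        h = layer @ h
    return h


def mean_effective_rank(depth, width=32, trials=20, seed=0):
    """Average effective rank of a depth-layer random linear network."""
    rng = np.random.default_rng(seed)
    ranks = []
    for _ in range(trials):
        layers = [
            rng.standard_normal((width, width)) / np.sqrt(width)
            for _ in range(depth)
        ]
        ranks.append(effective_rank(collapse(layers)))
    return float(np.mean(ranks))


if __name__ == "__main__":
    # Effective rank of the end-to-end matrix at random initialization.
    for depth in (1, 2, 4, 8):
        print(depth, round(mean_effective_rank(depth), 2))

    # A stack of linear layers computes the same function as its product,
    # so extra linear layers add parameters without adding capacity.
    rng = np.random.default_rng(1)
    layers = [rng.standard_normal((8, 8)) for _ in range(3)]
    x = rng.standard_normal(8)
    print(np.allclose(deep_linear_forward(x, layers), collapse(layers) @ x))
```

The sketch only samples random weights, so it says something about the distribution of end-to-end matrices at initialization and nothing about trained networks or generalization. The final check shows why stacking linear layers leaves the set of representable functions unchanged: the stack can always be collapsed back into a single matrix.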

updated: Thu Mar 18 2021 17:58:02 GMT+0000 (UTC)

published: Thu Mar 18 2021 17:58:02 GMT+0000 (UTC)

arXiv
