The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers

Preetum Nakkiran; Behnam Neyshabur; Hanie Sedghi

We propose a new framework for reasoning about generalization in deep learning. The core idea is to couple the Real World, where optimizers take stochastic gradient steps on the empirical loss, to an Ideal World, where optimizers take steps on the population loss. This leads to an alternate decomposition of test error into: (1) the Ideal World test error plus (2) the gap between the two worlds. If the gap (2) is universally small, this reduces the problem of generalization in offline learning to the problem of optimization in online learning. We then give empirical evidence that this gap between worlds can be small in realistic deep learning settings, in particular supervised image classification. For example, CNNs generalize better than MLPs on image distributions in the Real World, but this is "because" they optimize faster on the population loss in the Ideal World. This suggests our framework is a useful tool for understanding generalization in deep learning, and lays a foundation for future research in the area.
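
The decomposition described in the abstract can be written out explicitly. The following is a minimal sketch; the symbols are illustrative assumptions rather than notation taken from this page. Here $f_t$ denotes the Real World model after $t$ stochastic gradient steps on the empirical loss (samples reused from a finite training set), and $f_t^{\mathrm{iid}}$ denotes the Ideal World model after the same number of steps taken on fresh samples from the population.

```latex
% Assumed notation (not from the abstract itself):
%   f_t                : Real World model after t SGD steps on the empirical loss
%   f_t^{\mathrm{iid}} : Ideal World model after t steps on fresh population samples
\[
\mathrm{TestError}(f_t)
  \;=\;
  \underbrace{\mathrm{TestError}\bigl(f_t^{\mathrm{iid}}\bigr)}_{\text{(1) Ideal World test error}}
  \;+\;
  \underbrace{\Bigl[\mathrm{TestError}(f_t) - \mathrm{TestError}\bigl(f_t^{\mathrm{iid}}\bigr)\Bigr]}_{\text{(2) gap between the two worlds}}
\]
```

The identity holds trivially by adding and subtracting the Ideal World term. The substantive claim of the abstract is empirical: term (2) can be small in realistic settings, in which case the Real World test error is governed by term (1), that is, by how quickly the optimizer reduces the population loss in the online setting.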

updated: Fri Feb 19 2021 03:24:24 GMT+0000 (UTC)

published: Fri Oct 16 2020 03:07:49 GMT+0000 (UTC)

arXiv
