Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling

Gregory W. Benton; Wesley J. Maddox; Sanae Lotfi; Andrew Gordon Wilson

With a better understanding of the loss surfaces for multilayer networks, we can build more robust and accurate training procedures. Recently it was discovered that independently trained SGD solutions can be connected along one-dimensional paths of near-constant training loss. In this paper, we show that there are mode-connecting simplicial complexes that form multi-dimensional manifolds of low loss, connecting many independently trained models. Inspired by this discovery, we show how to efficiently build simplicial complexes for fast ensembling, outperforming independently trained deep ensembles in accuracy, calibration, and robustness to dataset shift. Notably, our approach only requires a few training epochs to discover a low-loss simplex, starting from a pre-trained solution. Code is available at https://github.com/g-benton/loss-surface-simplexes.
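
As a rough illustration of ensembling over a simplex in weight space, the sketch below draws parameter vectors from the convex hull of a few vertex solutions and averages their predicted class probabilities. It is a minimal sketch under stated assumptions, not the authors' implementation. The vertices are random stand-ins rather than trained solutions, the model is a toy linear classifier, and uniform Dirichlet sampling of barycentric coordinates is assumed. The helper names (`sample_simplex_weights`, `simplex_ensemble_predict`) are hypothetical. The procedure for finding low-loss vertices is not shown here; see the linked repository for the actual method.

```python
import numpy as np


def sample_simplex_weights(vertices, rng):
    """Draw one flat parameter vector from the simplex spanned by `vertices`.

    `vertices` has shape (n_vertices, n_params). Dirichlet(1, ..., 1) gives
    barycentric coordinates that are uniform over the simplex (an assumption
    made for this sketch).
    """
    coeffs = rng.dirichlet(np.ones(len(vertices)))
    return coeffs @ vertices


def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def simplex_ensemble_predict(vertices, x, n_samples, rng):
    """Average class probabilities of a toy linear classifier over simplex samples."""
    n_features = x.shape[1]
    probs = []
    for _ in range(n_samples):
        # Reshape the flat sample into a (n_features, n_classes) weight matrix.
        w = sample_simplex_weights(vertices, rng).reshape(n_features, -1)
        probs.append(softmax(x @ w))
    return np.mean(probs, axis=0)


rng = np.random.default_rng(0)
# Hypothetical stand-ins: 3 simplex vertices for a model with 4 features and 2 classes.
vertices = rng.normal(size=(3, 4 * 2))
x = rng.normal(size=(5, 4))
ensemble_probs = simplex_ensemble_predict(vertices, x, n_samples=10, rng=rng)
```

In this sketch every ensemble member comes from the same simplex, so adding members costs only a forward pass per sample rather than a full training run, which is the sense in which such an ensemble would be fast.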

updated: Thu Feb 25 2021 17:53:24 GMT+0000 (UTC)

published: Thu Feb 25 2021 17:53:24 GMT+0000 (UTC)

arXiv
