Statistically Significant Stopping of Neural Network Training

Justin K. Terry; Mario Jayakumar; Kusal De Alwis

The general approach taken when training deep learning classifiers is to save the parameters every few iterations, train until either a human observer or a simple metric-based heuristic decides the network is no longer learning, and then backtrack and pick the saved parameters with the best validation accuracy. Simple methods are used to decide that a network is no longer learning because, as long as training stops well after the optimal values are found, the stopping condition does not affect the final accuracy of the model. From a runtime perspective, however, the choice of stopping condition matters a great deal in the many cases where numerous neural networks are trained simultaneously (e.g. hyper-parameter tuning). Motivated by this, we introduce a statistical significance test to determine whether a neural network has stopped learning. This stopping criterion appears to represent a happy medium among popular stopping criteria: it achieves accuracy comparable to the criteria that reach the highest final accuracies while using 77% or fewer of the epochs, whereas the criteria that stop sooner do so with an appreciable loss in final accuracy. Additionally, we use this test as the basis of a new learning rate scheduler, which removes the need to choose learning rate schedules manually and acts as a quasi-line search, achieving superior or comparable empirical performance to existing methods.
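The abstract does not say which significance test the authors use, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a one-sided permutation test on validation accuracy, comparing the most recent `window` epochs against the `window` epochs before them, and it tracks the best epoch so the caller can backtrack to those saved parameters. The names `improvement_p_value`, `has_stopped_learning`, `train_with_stopping`, and the defaults `window=5` and `alpha=0.05` are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def improvement_p_value(older, recent, n_permutations=2000, seed=0):
    """One-sided permutation test for 'recent mean > older mean'.

    Assumption: this test is a stand-in; the abstract does not name one.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = mean(recent) - mean(older)
    pooled = list(older) + list(recent)
    n_old = len(older)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[n_old:]) - mean(pooled[:n_old])
        if diff >= observed:
            hits += 1
    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)


def has_stopped_learning(val_acc, window=5, alpha=0.05):
    """Return True when recent improvement is not statistically significant."""
    if len(val_acc) < 2 * window:
        return False  # not enough history to compare two windows
    older = val_acc[-2 * window:-window]
    recent = val_acc[-window:]
    return improvement_p_value(older, recent) >= alpha


def train_with_stopping(val_acc_stream, window=5, alpha=0.05):
    """Consume per-epoch validation accuracies; stop when learning stalls.

    Returns (best_epoch, best_acc, epochs_run). In real training, the
    parameters saved at best_epoch are the ones to backtrack to.
    """
    history = []
    best_epoch, best_acc = -1, float("-inf")
    for epoch, acc in enumerate(val_acc_stream):
        history.append(acc)
        if acc > best_acc:
            best_epoch, best_acc = epoch, acc
        if has_stopped_learning(history, window, alpha):
            break
    return best_epoch, best_acc, len(history)
```

In this sketch a larger `window` makes the test less sensitive to noise but delays stopping, and a smaller `alpha` stops sooner because improvement must clear a stricter bar to count as significant. Both would need tuning against a real training run.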

updated: Mon Mar 01 2021 18:51:16 GMT+0000 (UTC)

published: Mon Mar 01 2021 18:51:16 GMT+0000 (UTC)

arXiv
