Convolutional Neural Networks with Dynamic Regularization

Yi Wang; Zhen-Peng Bian; Junhui Hou; Lap-Pui Chau

Regularization is commonly used to alleviate overfitting in machine learning. For convolutional neural networks (CNNs), regularization methods such as DropBlock and Shake-Shake have been shown to improve generalization performance. However, these methods lack a self-adaptive ability throughout training. That is, the regularization strength follows a fixed, predefined schedule, and manual adjustments are required to adapt it to different network architectures. In this paper, we propose a dynamic regularization method for CNNs. Specifically, we model the regularization strength as a function of the training loss. In response to changes in the training loss, our method dynamically adjusts the regularization strength during training, thereby balancing underfitting and overfitting in CNNs. With dynamic regularization, a large-scale model is automatically regularized by a strong perturbation, and a small-scale model by a weak one. Experimental results show that the proposed method improves the generalization capability of off-the-shelf network architectures and outperforms state-of-the-art regularization methods.
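The abstract does not give the functional form that links regularization strength to training loss, so the following Python is only a minimal sketch of the general idea, not the paper's formula. The update rule (raise the strength when the training loss falls, lower it when the loss rises), the function names `update_strength` and `strength_schedule`, and the parameters `step` and `max_strength` are all assumptions made for illustration.

```python
# Illustrative sketch only: the update rule, names, and constants below are
# assumptions, not the formula used in the paper.

def update_strength(strength, prev_loss, curr_loss, step=0.05, max_strength=1.0):
    """Return a new regularization strength based on the change in training loss."""
    if curr_loss < prev_loss:
        # Training loss is falling: the model is fitting the data more easily,
        # so apply a stronger perturbation to guard against overfitting.
        strength += step
    elif curr_loss > prev_loss:
        # Training loss is rising: ease off to avoid underfitting.
        strength -= step
    # Clamp the strength to the range [0, max_strength].
    return min(max(strength, 0.0), max_strength)


def strength_schedule(losses, initial=0.0, **kwargs):
    """Compute the strength after each epoch from per-epoch training losses."""
    strengths = [initial]
    for prev, curr in zip(losses, losses[1:]):
        strengths.append(update_strength(strengths[-1], prev, curr, **kwargs))
    return strengths
```

In a training loop, the returned strength could scale the perturbation applied by a method such as DropBlock or Shake-Shake. Under this assumed rule, a model whose training loss keeps dropping ends up with a stronger perturbation than one whose loss stalls, which matches the behavior the abstract describes.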

updated: Thu Dec 31 2020 03:14:07 GMT+0000 (UTC)

published: Thu Sep 26 2019 03:06:49 GMT+0000 (UTC)

arXiv
