Moment Centralization based Gradient Descent Optimizers for Convolutional Neural Networks

Sumanth Sadu; Shiv Ram Dubey; SR Sreeja

Convolutional neural networks (CNNs) have shown very appealing performance in many computer vision applications. CNNs are generally trained using stochastic gradient descent (SGD) based optimization techniques, and adaptive momentum-based SGD optimizers are the recent trend. However, the existing optimizers are not able to maintain a zero mean in the first-order moment and struggle with optimization. In this paper, we propose a moment centralization-based SGD optimizer for CNNs. Specifically, we explicitly impose a zero-mean constraint on the first-order moment. The proposed moment centralization is generic in nature and can be integrated with any of the existing adaptive momentum-based optimizers. The proposed idea is tested with three state-of-the-art optimization techniques, namely Adam, RAdam, and AdaBelief, on the benchmark CIFAR10, CIFAR100, and TinyImageNet datasets for image classification. The performance of the existing optimizers generally improves when they are integrated with the proposed moment centralization. Further, the results of the proposed moment centralization are also better than those of the existing gradient centralization. An analysis on a toy example shows that the proposed method leads to a shorter and smoother optimization trajectory. The source code is publicly available at https://github.com/sumanthsadhu/MC-optimizer.
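
The abstract does not spell out the update rule, so the following is only a minimal NumPy sketch of how a zero-mean constraint on the first-order moment might be added to a standard Adam step. The function names `centralize` and `adam_mc_step`, the choice to take the mean over every axis except the first, and the decision to skip 1-D parameters are all assumptions made for illustration; the authors' actual implementation is in the repository linked above and may differ.

```python
import numpy as np


def centralize(m):
    """Subtract the mean so that each slice along axis 0 has zero mean.

    Assumption: the mean is taken over all axes except the first
    (one mean per output filter), and 1-D parameters are left as-is.
    """
    if m.ndim < 2:
        return m
    axes = tuple(range(1, m.ndim))
    return m - m.mean(axis=axes, keepdims=True)


def adam_mc_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with the first moment centralized (hypothetical sketch).

    w: parameters, g: gradient, m/v: first/second moment estimates,
    t: 1-based step counter.
    """
    # Standard Adam first-moment update.
    m = beta1 * m + (1.0 - beta1) * g
    # Assumed moment centralization: enforce zero mean on the first moment.
    m = centralize(m)
    # Standard Adam second-moment update.
    v = beta2 * v + (1.0 - beta2) * g * g
    # Bias correction, as in Adam.
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # Parameter update.
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(4, 3, 3, 3))  # e.g. a small convolution kernel
    g = rng.normal(size=w.shape)
    w, m, v = adam_mc_step(w, g, np.zeros_like(w), np.zeros_like(w), t=1)
    # Each filter's first moment now has (numerically) zero mean.
    print(np.abs(m.mean(axis=(1, 2, 3))).max())
```

Under these assumptions, the only difference from plain Adam is the single `centralize(m)` call, which illustrates why such a constraint could be combined with other momentum-based optimizers.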

updated: Tue Jul 19 2022 04:38:01 GMT+0000 (UTC)

published: Tue Jul 19 2022 04:38:01 GMT+0000 (UTC)

arXiv
