Angle based dynamic learning rate for gradient descent

Neel Mishra; Pawan Kumar

We propose a novel yet simple approach to obtain an adaptive learning rate for gradient-based descent methods on classification tasks. Traditional approaches select adaptive learning rates via the decayed expectation of gradient-based terms; we instead use the angle between the current gradient and a new gradient. The new gradient is computed from the direction orthogonal to the current gradient, which helps determine a better adaptive learning rate based on the angle history and leads to comparatively better accuracy than existing state-of-the-art optimizers. On a wide variety of benchmark datasets with prominent image classification architectures such as ResNet, DenseNet, EfficientNet, and VGG, our method achieves the highest accuracy on most datasets. We also prove that our method converges.
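The abstract does not state the update rule, so the following is only a rough sketch of the general idea, not the authors' algorithm. It probes the gradient at a point displaced along a direction orthogonal to the current gradient, records the angle between the two gradients, and scales a base learning rate by a function of the mean angle so far. The scaling rule `0.5 * (1 + cos(mean_angle))`, the random choice of orthogonal direction, and the names `angle_lr_descent`, `base_lr`, and `probe` are all assumptions made for illustration.

```python
import numpy as np


def angle_between(u, v, eps=1e-12):
    """Angle in radians between vectors u and v."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))


def orthogonal_unit(g, rng, eps=1e-12):
    """Random unit vector orthogonal to g (one Gram-Schmidt step).

    Assumes g has at least two components.
    """
    r = rng.standard_normal(g.shape)
    r = r - (np.dot(r, g) / (np.dot(g, g) + eps)) * g
    return r / (np.linalg.norm(r) + eps)


def angle_lr_descent(grad_fn, x0, base_lr=0.1, probe=0.1, steps=50, seed=0):
    """Gradient descent with a learning rate scaled by the angle history.

    Hypothetical rule (an assumption, not taken from the paper):
    lr = base_lr * 0.5 * (1 + cos(mean angle so far)).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    angles = []
    for _ in range(steps):
        g = grad_fn(x)
        if np.linalg.norm(g) < 1e-10:
            break  # gradient is effectively zero, nothing left to do
        d = orthogonal_unit(g, rng)
        g_new = grad_fn(x + probe * d)  # gradient at the orthogonal probe point
        angles.append(angle_between(g, g_new))
        mean_angle = float(np.mean(angles))
        lr = base_lr * 0.5 * (1.0 + np.cos(mean_angle))
        x = x - lr * g
    return x, angles


# Usage on a toy quadratic f(x) = 0.5 * x^T A x, whose gradient is A x.
A = np.diag([1.0, 10.0])
x_final, angle_history = angle_lr_descent(lambda x: A @ x, [1.0, 1.0])
```

Under this assumed rule, a small mean angle (the gradient direction barely changes nearby) keeps the step close to `base_lr`, while a large mean angle shrinks it. The extra gradient evaluation per step is the cost of measuring the angle.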

updated: Thu Apr 20 2023 16:55:56 GMT+0000 (UTC)

published: Thu Apr 20 2023 16:55:56 GMT+0000 (UTC)

arXiv
