K-SAM: Sharpness-Aware Minimization at the Speed of SGD

Renkun Ni; Ping-yeh Chiang; Jonas Geiping; Micah Goldblum; Andrew Gordon Wilson; Tom Goldstein

Sharpness-Aware Minimization (SAM) has recently emerged as a robust technique for improving the accuracy of deep neural networks. However, SAM incurs a high computational cost in practice, requiring up to twice as much computation as vanilla SGD. This cost arises because each iteration requires both an ascent step and a descent step, which doubles the number of gradient computations. To address this challenge, we propose to compute gradients in both stages of SAM on only the top-k samples with the highest loss. K-SAM is simple and extremely easy to implement, and it provides significant generalization boosts over vanilla SGD at little to no additional cost.
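The top-k idea in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation. It assumes a toy linear model with squared-error loss, and the function names (`per_sample_losses`, `top_k_indices`, `mean_gradient`, `k_sam_step`) and the hyperparameters `lr` and `rho` are hypothetical. It also assumes the same top-k subset is reused for the ascent and descent gradients, a detail the abstract does not specify.

```python
import math

def per_sample_losses(w, X, y):
    """Squared error of a linear model w.x for each sample (toy loss, an assumption)."""
    return [(sum(wi * xi for wi, xi in zip(w, x)) - t) ** 2 for x, t in zip(X, y)]

def top_k_indices(losses, k):
    """Indices of the k samples with the highest loss, largest first."""
    return sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)[:k]

def mean_gradient(w, X, y, idx):
    """Gradient of the mean squared error over the samples listed in idx."""
    g = [0.0] * len(w)
    for i in idx:
        residual = sum(wi * xi for wi, xi in zip(w, X[i])) - y[i]
        for j, xj in enumerate(X[i]):
            g[j] += 2.0 * residual * xj / len(idx)
    return g

def k_sam_step(w, X, y, k, lr=0.1, rho=0.05):
    """One SAM-style update that uses only the k highest-loss samples for gradients."""
    # A forward pass over the full batch ranks the samples; no gradients needed here.
    idx = top_k_indices(per_sample_losses(w, X, y), k)
    # Ascent step: move to the perturbed weights w + rho * g / ||g||.
    g = mean_gradient(w, X, y, idx)
    norm = math.sqrt(sum(gi * gi for gi in g)) + 1e-12
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: gradient at the perturbed weights, applied to the original weights.
    # Assumption: the same subset idx is reused rather than re-selected.
    g_adv = mean_gradient(w_adv, X, y, idx)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)], idx

# Example usage on a three-sample batch with k = 2.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0, 0.0]
new_w, used = k_sam_step([0.0, 0.0], X, y, k=2)
```

In this sketch both gradient computations touch k samples instead of the full batch, which is where the savings over standard SAM would come from; the ranking pass needs only forward evaluations.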

updated: Sun Oct 23 2022 21:49:58 GMT+0000 (UTC)

published: Sun Oct 23 2022 21:49:58 GMT+0000 (UTC)

arXiv
