Revisiting Batch Normalization

Jim Davis; Logan Frank

Batch normalization (BN) consists of a normalization component followed by an affine transformation and has become essential for training deep neural networks. Standard initialization of each BN in a network sets the affine transformation scale and shift to 1 and 0, respectively. However, we have observed that after training these parameters change little from their initialization. Furthermore, we have noticed that the normalization process can still yield overly large values, which is undesirable for training. We revisit the BN formulation and present a new initialization method and update approach for BN to address these issues. Experimental results using the proposed alterations to BN show statistically significant performance gains in a variety of scenarios. The approach can be used with existing implementations at no additional computational cost. We also present a new online BN-based input data normalization technique to alleviate the need for other offline or fixed methods. Source code is available at https://github.com/osu-cvl/revisiting-bn.
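For orientation, the standard BN formulation described in the abstract (normalize, then apply an affine transformation with scale 1 and shift 0) can be sketched as below. This is a minimal pure-Python illustration of conventional BN for a single feature, not the initialization or update approach proposed in the paper. The function name `batch_norm`, the `eps` value, and the sample batch are assumptions chosen for illustration.

```python
import math

def batch_norm(values, scale=1.0, shift=0.0, eps=1e-5):
    """Standard batch normalization of one feature across a batch.

    scale and shift default to the conventional initialization (1 and 0).
    eps is a small stabilizing constant; its value here is an assumption.
    """
    n = len(values)
    mean = sum(values) / n
    # Biased (population) variance over the batch.
    var = sum((v - mean) ** 2 for v in values) / n
    normalized = [(v - mean) / math.sqrt(var + eps) for v in values]
    # Affine transformation applied after normalization.
    return [scale * z + shift for z in normalized]

# Hypothetical batch: seven values near zero and one outlier.
batch = [0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 0.02, 10.0]
out = batch_norm(batch)
largest = max(out)
```

With a batch like this, the output has zero mean, yet the outlier still maps to a value well above 1 after normalization, which gives a small-scale picture of how normalized activations can remain large.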

updated: Tue Oct 26 2021 19:48:19 GMT+0000 (UTC)

published: Tue Oct 26 2021 19:48:19 GMT+0000 (UTC)

arXiv
