Delving into the Estimation Shift of Batch Normalization in a Network

Lei Huang; Yi Zhou; Tian Wang; Jie Luo; Xianglong Liu

Batch normalization (BN) is a milestone technique in deep learning. It normalizes activations using mini-batch statistics during training but uses estimated population statistics during inference. This paper investigates the estimation of those population statistics. We define the estimation shift magnitude of BN to quantitatively measure the difference between its estimated population statistics and the expected ones. Our primary observation is that the estimation shift can accumulate when BN layers are stacked in a network, which has detrimental effects on test performance. We further find that a batch-free normalization (BFN) can block this accumulation of estimation shift. These observations motivate our design of XBNBlock, which replaces one BN with a BFN in the bottleneck block of residual-style networks. Experiments on the ImageNet and COCO benchmarks show that XBNBlock consistently improves the performance of different architectures, including ResNet and ResNeXt, by a significant margin and appears to be more robust to distribution shift.
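To make the train/inference distinction in the abstract concrete, the following is a minimal pure-Python sketch, not the paper's implementation. It assumes a single scalar feature, an exponential moving average for the population estimates (a common convention, not something the abstract specifies), and a simple relative gap as a stand-in for the paper's estimation shift magnitude, whose exact definition is not given here. All names (`ToyBatchNorm`, `per_sample_norm`, `relative_gap`, `run_demo`) are hypothetical.

```python
import random


def batch_stats(values):
    """Return the mean and biased variance of a list of floats."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


class ToyBatchNorm:
    """One-feature batch norm without learnable scale and shift (illustrative)."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0  # estimated population mean
        self.running_var = 1.0   # estimated population variance

    def train_step(self, batch):
        # Training: normalize with the statistics of the current mini-batch.
        mean, var = batch_stats(batch)
        # Assumption: population statistics are tracked by a moving average.
        m = self.momentum
        self.running_mean = (1 - m) * self.running_mean + m * mean
        self.running_var = (1 - m) * self.running_var + m * var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]

    def infer(self, batch):
        # Inference: normalize with the estimated population statistics.
        scale = (self.running_var + self.eps) ** 0.5
        return [(x - self.running_mean) / scale for x in batch]


def per_sample_norm(features, eps=1e-5):
    """Batch-free normalization of one sample over its own features.

    The statistics come from the sample itself, so training and inference
    use the same computation and no population estimate is needed.
    """
    mean, var = batch_stats(features)
    return [(f - mean) / (var + eps) ** 0.5 for f in features]


def relative_gap(estimated, expected):
    """Simplified stand-in for a shift measure (not the paper's definition)."""
    return abs(estimated - expected) / abs(expected)


def run_demo(seed=0, steps=200, batch_size=32, true_mean=3.0, true_std=2.0):
    """Train the toy layer on Gaussian data and return it for inspection."""
    rng = random.Random(seed)
    bn = ToyBatchNorm()
    for _ in range(steps):
        batch = [rng.gauss(true_mean, true_std) for _ in range(batch_size)]
        bn.train_step(batch)
    return bn
```

After `bn = run_demo()`, calling `relative_gap(bn.running_mean, 3.0)` and `relative_gap(bn.running_var, 4.0)` gives a rough sense of how far the estimated statistics of a single layer sit from the true ones. The sketch does not model the accumulation across stacked layers that the paper studies.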

updated: Mon Mar 21 2022 07:49:14 GMT+0000 (UTC)

published: Mon Mar 21 2022 07:49:14 GMT+0000 (UTC)

arXiv
