In-Hindsight Quantization Range Estimation for Quantized Training

Marios Fournarakis; Markus Nagel

Quantization techniques applied to the inference of deep neural networks have enabled fast and efficient execution on resource-constrained devices. The success of quantization during inference has motivated the academic community to explore fully quantized training, i.e., quantizing back-propagation as well. However, effective gradient quantization is still an open problem. Gradients are unbounded and their distribution changes significantly during training, which leads to the need for dynamic quantization. As we show, dynamic quantization can lead to significant memory overhead and additional data traffic, both of which slow down training. We propose a simple alternative to dynamic quantization, in-hindsight range estimation, which uses the quantization ranges estimated on previous iterations to quantize the current one. Our approach enables fast static quantization of gradients and activations while requiring only minimal hardware support from the neural network accelerator to keep track of output statistics in an online fashion. It is intended as a drop-in replacement for estimating quantization ranges and can be used in conjunction with other advances in quantized training. We compare our method to existing methods for range estimation from the quantized training literature and demonstrate its effectiveness with a range of architectures, including MobileNetV2, on image classification benchmarks (Tiny ImageNet & ImageNet).
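
The idea described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the class `InHindsightRange`, the helpers `fake_quantize` and `quantize_in_hindsight`, the exponential-moving-average update with a `momentum` parameter, and the first-iteration fallback are all assumptions made for illustration. The point it tries to show is that the tensor at the current step is quantized with a range fixed before that tensor was seen, while its min/max statistics are only folded into the range for the next step.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class InHindsightRange:
    """Hypothetical range tracker; the EMA update rule is an assumption."""
    momentum: float = 0.9        # assumed smoothing factor
    lo: Optional[float] = None   # lower bound used for the current step
    hi: Optional[float] = None   # upper bound used for the current step

    def update(self, observed_min: float, observed_max: float) -> None:
        # Fold the statistics of the tensor just produced into the range
        # that the next iteration will use.
        if self.lo is None or self.hi is None:
            self.lo, self.hi = observed_min, observed_max
        else:
            m = self.momentum
            self.lo = m * self.lo + (1.0 - m) * observed_min
            self.hi = m * self.hi + (1.0 - m) * observed_max


def fake_quantize(x: Sequence[float], lo: float, hi: float,
                  bits: int = 8) -> List[float]:
    """Uniform asymmetric quantization to `bits` bits, then dequantization."""
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    out = []
    for v in x:
        clipped = min(max(v, lo), hi)          # values outside the range saturate
        q = round((clipped - lo) / scale)      # integer grid index
        out.append(lo + q * scale)
    return out


def quantize_in_hindsight(x: Sequence[float], tracker: InHindsightRange,
                          bits: int = 8) -> List[float]:
    x_min, x_max = min(x), max(x)  # statistics gathered online from the output
    if tracker.lo is None:
        # No history on the very first step: fall back to the current
        # statistics (an assumption of this sketch).
        tracker.update(x_min, x_max)
        return fake_quantize(x, tracker.lo, tracker.hi, bits)
    # Static quantization: the range was fixed before x was observed.
    out = fake_quantize(x, tracker.lo, tracker.hi, bits)
    tracker.update(x_min, x_max)
    return out
```

In a training loop, one such tracker would be kept per quantized tensor (for example, per layer's gradient), and `quantize_in_hindsight` would be called once per iteration. Because the range is known before the tensor is produced, values that exceed it saturate on that step and only widen the range afterwards.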

updated: Mon May 10 2021 10:25:28 GMT+0000 (UTC)

published: Mon May 10 2021 10:25:28 GMT+0000 (UTC)

arXiv
