Semi-Relaxed Quantization with DropBits: Training Low-Bit Neural Networks via Bit-wise Regularization

Jung Hyun Lee; Jihun Yun; Sung Ju Hwang; Eunho Yang

Network quantization, which aims to reduce the bit-lengths of network weights and activations, has emerged as one of the key ingredients for shrinking neural networks so that they can be deployed on resource-limited devices. To overcome the difficulty of transforming continuous activations and weights into discrete ones, a recent study called Relaxed Quantization (RQ) [Louizos et al. 2019] successfully employs the popular Gumbel-Softmax, which makes this transformation amenable to efficient gradient-based optimization. However, RQ with this Gumbel-Softmax relaxation still suffers from a bias-variance trade-off that depends on the temperature parameter of Gumbel-Softmax. To resolve this issue, we propose a novel method, Semi-Relaxed Quantization (SRQ), which uses a multi-class straight-through estimator to effectively reduce both the bias and the variance, along with a new regularization technique, DropBits, which replaces dropout regularization and randomly drops bits instead of neurons to further reduce the bias of the multi-class straight-through estimator in SRQ. As a natural extension of DropBits, we further introduce a way of learning heterogeneous quantization levels, using DropBits to find a proper bit-length for each layer. We experimentally validate our method on various benchmark datasets and network architectures, and also support the quantized lottery ticket hypothesis: learning heterogeneous quantization levels outperforms training from scratch with the same quantization levels held fixed.
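The temperature dependence mentioned in the abstract can be illustrated with a small, generic sketch of a Gumbel-Softmax relaxation over a quantization grid. This is not the RQ or SRQ procedure from the paper: the function names, the distance-based logits, and the `scale` parameter are assumptions chosen only for illustration. The sketch draws a relaxed one-hot sample over the grid points, returns the relaxed value (a weighted average of grid points), and also returns the hard value (the grid point with the largest weight), which is what a straight-through estimator would generally use in the forward pass while differentiating through the relaxed sample.

```python
import math
import random


def gumbel_softmax(logits, tau, rng):
    """Draw one relaxed one-hot sample from a categorical with the given logits."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise via inverse transform: -log(-log(U)), U ~ Uniform(0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep the logs finite
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def quantize(x, grid, tau, rng, scale=0.1):
    """Return (relaxed value, hard value, sample weights) for input x.

    Assumption: logits are negative squared distances to each grid point,
    so nearer grid points are more likely. `scale` is a hypothetical knob.
    """
    logits = [-((x - g) ** 2) / (2 * scale ** 2) for g in grid]
    probs = gumbel_softmax(logits, tau, rng)
    relaxed = sum(p * g for p, g in zip(probs, grid))
    hard = grid[max(range(len(grid)), key=probs.__getitem__)]
    return relaxed, hard, probs


if __name__ == "__main__":
    rng = random.Random(0)
    grid = [i / 3 for i in range(4)]  # four levels, i.e. a 2-bit grid on [0, 1]
    for tau in (0.1, 1.0, 5.0):
        relaxed, hard, _ = quantize(0.4, grid, tau, rng)
        print(f"tau={tau}: relaxed={relaxed:.3f}, hard={hard:.3f}")
```

With a small `tau` the sample weights concentrate on a single grid point, so the relaxed value stays close to a true quantized value but the sample changes abruptly with the noise; with a large `tau` the weights spread over the grid, which smooths the sample at the cost of drifting away from any actual grid point.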

updated: Tue Sep 07 2021 07:03:39 GMT+0000 (UTC)

published: Fri Nov 29 2019 07:58:43 GMT+0000 (UTC)

arXiv
