Diversifying Sample Generation for Accurate Data-Free Quantization

Xiangguo Zhang; Haotong Qin; Yifu Ding; Ruihao Gong; Qinghua Yan; Renshuai Tao; Yuhang Li; Fengwei Yu; Xianglong Liu


Quantization has emerged as one of the most prevalent approaches to compressing and accelerating neural networks. Recently, data-free quantization has been widely studied as a practical and promising solution. It synthesizes data for calibrating the quantized model according to the batch normalization (BN) statistics of the FP32 model, which significantly relieves the heavy dependency on real training data in traditional quantization methods. Unfortunately, we find that in practice, synthetic data identically constrained by BN statistics suffers serious homogenization at both the distribution level and the sample level, which in turn causes a significant performance drop in the quantized model. We propose the Diverse Sample Generation (DSG) scheme to mitigate the adverse effects of homogenization. Specifically, we slacken the alignment of feature statistics in the BN layers to relax the constraint at the distribution level, and design a layerwise enhancement to reinforce specific layers for different data samples. Our DSG scheme is versatile and can even be applied to state-of-the-art post-training quantization methods such as AdaRound. We evaluate the DSG scheme on the large-scale image classification task and consistently obtain significant improvements across various network architectures and quantization methods, especially when quantizing to lower bit-widths (e.g., up to 22% improvement on W4A4). Moreover, benefiting from the enhanced diversity, models calibrated with synthetic data perform close to those calibrated with real data and even outperform them on W4A4.
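The abstract describes slackening the alignment between the feature statistics of synthetic data and the stored BN statistics, but it does not give the loss itself. The following is a minimal sketch of one way such a relaxation could look, assuming a margin-based penalty that ignores gaps smaller than a tolerance. The function names (`slack_alignment_loss`, `batch_mean_var`, `bn_slack_loss`) and the margin parameters are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def slack_alignment_loss(
    synthetic: Sequence[float], target: Sequence[float], margin: float
) -> float:
    """Sum of squared gaps, counting only the part of each gap beyond `margin`.

    Assumption: the slack is modeled as a hinge on the absolute difference.
    """
    total = 0.0
    for s, t in zip(synthetic, target):
        excess = max(abs(s - t) - margin, 0.0)  # zero inside the slack band
        total += excess ** 2
    return total


def batch_mean_var(
    features: Sequence[Sequence[float]],
) -> Tuple[List[float], List[float]]:
    """Per-channel mean and biased variance of a batch of feature vectors."""
    n = len(features)
    channels = list(zip(*features))  # transpose: one tuple per channel
    means = [sum(c) / n for c in channels]
    variances = [
        sum((x - m) ** 2 for x in c) / n for c, m in zip(channels, means)
    ]
    return means, variances


def bn_slack_loss(
    features: Sequence[Sequence[float]],
    bn_mean: Sequence[float],
    bn_var: Sequence[float],
    mean_margin: float,
    var_margin: float,
) -> float:
    """Relaxed alignment of batch statistics to stored BN statistics."""
    mean, var = batch_mean_var(features)
    return slack_alignment_loss(mean, bn_mean, mean_margin) + slack_alignment_loss(
        var, bn_var, var_margin
    )


# Two synthetic samples with two channels each (illustrative values only).
features = [[1.0, 2.0], [3.0, 2.0]]
strict = bn_slack_loss(features, [2.5, 2.0], [1.0, 0.0], 0.0, 0.0)
relaxed = bn_slack_loss(features, [2.5, 2.0], [1.0, 0.0], 0.5, 0.5)
```

With zero margins this reduces to an ordinary squared-error match on the BN statistics, so every synthetic batch is pulled toward exactly the same values. With a positive margin, any batch whose statistics fall inside the tolerance band incurs no penalty, which leaves room for the synthetic samples to differ from one another.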

updated: Mon Mar 01 2021 14:46:02 GMT+0000 (UTC)

published: Mon Mar 01 2021 14:46:02 GMT+0000 (UTC)

arXiv
