Adaptive Binary-Ternary Quantization

Ryan Razani; Grégoire Morin; Vahid Partovi Nia; Eyyüb Sari

Neural network models are resource hungry, which makes deep networks difficult to deploy on devices with limited resources, such as smart wearables, cellphones, drones, and autonomous vehicles. Low-bit quantization, such as binary and ternary quantization, is a common approach to alleviating these resource requirements. Ternary quantization provides a more flexible model and outperforms binary quantization in terms of accuracy, but it doubles the memory footprint and increases the computational cost. In contrast to these approaches, mixed quantized models allow a trade-off between accuracy and memory footprint. In such models, the quantization depth is often chosen manually or tuned using a separate optimization routine; the latter requires training a quantized network multiple times. Here, we propose an adaptive combination of binary and ternary quantization, namely Smart Quantization (SQ), in which the quantization depth is modified directly via a regularization function, so that the model is trained only once. Our experimental results show that the proposed method adapts the quantization depth successfully while keeping model accuracy high on the MNIST and CIFAR10 benchmarks.
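
The memory trade-off described in the abstract can be illustrated with a minimal sketch. It is not the Smart Quantization method itself and does not include the regularization function, which the abstract does not specify. The function names, the fixed threshold, and the storage costs of 1 bit per binary weight and 2 bits per ternary weight are assumptions made for illustration.

```python
# Illustrative sketch only: plain binary/ternary weight quantization and a
# per-layer memory count. All names and the threshold value are hypothetical.

def binarize(weights):
    """Map each weight to {-1, +1} by sign (zero maps to +1)."""
    return [1 if w >= 0 else -1 for w in weights]

def ternarize(weights, threshold):
    """Map each weight to {-1, 0, +1}; weights with |w| <= threshold become 0."""
    return [0 if abs(w) <= threshold else (1 if w > 0 else -1) for w in weights]

def memory_bits(layer_sizes, is_ternary):
    """Total weight storage, assuming 1 bit per binary and 2 bits per ternary weight."""
    return sum(n * (2 if t else 1) for n, t in zip(layer_sizes, is_ternary))

weights = [0.9, -0.05, 0.3, -0.7, 0.02]
print(binarize(weights))        # [1, -1, 1, -1, 1]
print(ternarize(weights, 0.1))  # [1, 0, 1, -1, 0]

sizes = [1000, 4000, 500]       # hypothetical weight counts for three layers
print(memory_bits(sizes, [False, False, False]))  # all binary: 5500
print(memory_bits(sizes, [True, True, True]))     # all ternary: 11000
print(memory_bits(sizes, [False, True, False]))   # mixed: 9500
```

Under these assumptions, an all-ternary model costs exactly twice the bits of an all-binary one, and a mixed assignment falls in between. That per-layer choice is the quantization depth that a mixed model has to select.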

updated: Mon Sep 13 2021 18:28:56 GMT+0000 (UTC)

published: Thu Sep 26 2019 15:49:08 GMT+0000 (UTC)

arXiv
