Sub-bit Neural Networks: Learning to Compress and Accelerate Binary Neural Networks

Yikai Wang; Yi Yang; Fuchun Sun; Anbang Yao

In low-bit quantization, training Binary Neural Networks (BNNs) is the extreme solution for easing the deployment of deep models on resource-constrained devices: BNNs have the lowest storage cost and use significantly cheaper bit-wise operations than their 32-bit floating-point counterparts. In this paper, we introduce Sub-bit Neural Networks (SNNs), a new type of binary quantization design tailored to compress and accelerate BNNs. SNNs are inspired by an empirical observation that the binary kernels learned in the convolutional layers of a BNN model are likely to be distributed over subsets of the kernel space. Accordingly, unlike existing methods that binarize weights one by one, SNNs are trained with a kernel-aware optimization framework that exploits binary quantization in the fine-grained convolutional kernel space. Specifically, our method includes a random sampling step that generates layer-specific subsets of the kernel space, and a refinement step that learns to adjust these subsets of binary kernels via optimization. Experiments on visual recognition benchmarks and a hardware deployment on FPGA validate the great potential of SNNs. For instance, on ImageNet, SNNs of ResNet-18/ResNet-34 with 0.56-bit weights achieve 3.13/3.33 times runtime speed-up and 1.8 times compression over conventional BNNs, with moderate drops in recognition accuracy. Promising results are also obtained when applying SNNs to binarize both weights and activations. Our code is available at https://github.com/yikaiw/SNN.
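The following is a minimal sketch of the random sampling step described in the abstract. It is not the authors' implementation. It assumes 3x3 kernels, so a layer's kernel space holds 2^9 = 512 binary kernels, and a subset of 2^5 = 32 kernels addressed by a 5-bit index. That choice is one reading consistent with the reported figures, since 5/9 ≈ 0.56 bits per weight and 9/5 = 1.8 times compression. The function names (`sample_kernel_subset`, `quantize_to_subset`, `bits_per_weight`) and the nearest-neighbour assignment are hypothetical, and the refinement step that adjusts the subset during training is not shown.

```python
import numpy as np

KERNEL_SIZE = 9  # weights per kernel (assumption: 3x3 convolutions)


def sample_kernel_subset(index_bits, rng):
    """Randomly pick 2**index_bits distinct binary kernels from the 2**9 possible ones."""
    codes = rng.choice(2 ** KERNEL_SIZE, size=2 ** index_bits, replace=False)
    # Unpack each integer code into 9 bits, then map {0, 1} to {-1, +1}.
    bits = (codes[:, None] >> np.arange(KERNEL_SIZE)) & 1
    return (2 * bits - 1).astype(np.float32)


def quantize_to_subset(kernels, subset):
    """Map each real-valued kernel, shape (N, 9), to the index of its nearest subset member."""
    # Squared Euclidean distance between every kernel and every subset entry.
    dists = ((kernels[:, None, :] - subset[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


def bits_per_weight(index_bits):
    """Storage cost per weight when each kernel is stored as an index into the subset."""
    return index_bits / KERNEL_SIZE


rng = np.random.default_rng(0)
subset = sample_kernel_subset(5, rng)          # layer-specific subset, shape (32, 9)
latent = rng.standard_normal((64, KERNEL_SIZE)).astype(np.float32)  # stand-in latent kernels
indices = quantize_to_subset(latent, subset)   # one 5-bit index per kernel
binary_kernels = subset[indices]               # reconstructed binary kernels, shape (64, 9)

print(round(bits_per_weight(5), 2))      # 0.56
print(round(1 / bits_per_weight(5), 1))  # 1.8
```

Under these assumptions, each kernel is stored as a 5-bit index rather than 9 individual sign bits, which is where the sub-bit cost per weight comes from.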

updated: Mon Oct 18 2021 11:30:29 GMT+0000 (UTC)

published: Mon Oct 18 2021 11:30:29 GMT+0000 (UTC)

arXiv
