Universal Adder Neural Networks

Hanting Chen; Yunhe Wang; Chang Xu; Chao Xu; Chunjing Xu; Tong Zhang

Compared with cheap addition, multiplication has much higher computational complexity. The convolutions widely used in deep neural networks are in fact cross-correlations that measure the similarity between input features and convolution filters, which involves massive numbers of multiplications between floating-point values. In this paper, we present adder networks (AdderNets), which trade these massive multiplications in deep neural networks, especially convolutional neural networks (CNNs), for much cheaper additions to reduce computation costs. In AdderNets, we take the ℓ_1-norm distance between filters and input features as the output response. We first develop a theoretical foundation for AdderNets by showing that both the single-hidden-layer AdderNet and the width-bounded deep AdderNet with ReLU activation functions are universal function approximators. An approximation bound for AdderNets with a single hidden layer is also presented. We further analyze the influence of this new similarity measure on the optimization of neural networks and develop a special training scheme for AdderNets. Based on the gradient magnitude, an adaptive learning rate strategy is proposed to enhance the training procedure of AdderNets. AdderNets can achieve 75.7% Top-1 accuracy and 92.3% Top-5 accuracy using ResNet-50 on the ImageNet dataset without any multiplication in the convolutional layers.
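
To make the ℓ_1-based output response concrete, the following is a minimal NumPy sketch of a single adder layer. It is an illustration under stated assumptions, not the authors' implementation: the function name `adder_conv2d`, the tensor layout, stride 1, no padding, and the negation of the distance (so that a larger response means a closer match) are all choices made here for the example.

```python
import numpy as np


def adder_conv2d(x, filters):
    """Hypothetical adder layer: negated l1 distance between patches and filters.

    x:       (H, W, C_in) input feature map
    filters: (d, d, C_in, C_out) filter bank
    returns: (H - d + 1, W - d + 1, C_out); stride 1, no padding (assumed)
    """
    h, w, _ = x.shape
    d, _, _, c_out = filters.shape
    out = np.empty((h - d + 1, w - d + 1, c_out), dtype=np.float64)
    for m in range(h - d + 1):
        for n in range(w - d + 1):
            # Add a trailing axis so the patch broadcasts against every filter.
            patch = x[m:m + d, n:n + d, :, None]  # (d, d, C_in, 1)
            # Only subtraction, absolute value, and summation are used here;
            # the negation is an assumed sign convention for this sketch.
            out[m, n] = -np.abs(patch - filters).sum(axis=(0, 1, 2))
    return out


x = np.arange(9, dtype=np.float64).reshape(3, 3, 1)
zero_filter = np.zeros((2, 2, 1, 1))
y = adder_conv2d(x, zero_filter)
```

With this sign convention the response is never positive, and it is exactly zero when a patch matches the filter. A standard convolution would instead multiply the patch and filter elementwise before summing.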

updated: Tue Jun 29 2021 09:52:33 GMT+0000 (UTC)

published: Sat May 29 2021 04:02:51 GMT+0000 (UTC)

arXiv
