MixBin: Towards Budgeted Binarization

Udbhav Bamba; Neeraj Anand; Dilip K. Prasad; Deepak K. Gupta

Binarization has proven to be among the most effective methods of neural network compression, substantially reducing the FLOPs of the original model. However, such levels of compression are often accompanied by a significant drop in performance. Some approaches reduce this performance drop by facilitating partial binarization of the network; however, a systematic approach to mixing binary and full-precision parameters in a single network is still missing. In this paper, we propose a paradigm to perform partial binarization of neural networks in a controlled manner, thereby constructing budgeted binary neural networks (B2NNs). We present MixBin, an iterative search-based strategy that constructs a B2NN through optimized mixing of binary and full-precision components. MixBin allows the user to explicitly choose the approximate fraction of the network to be kept as binary, thereby providing the flexibility to adapt the inference cost to a prescribed budget. We demonstrate through experiments that B2NNs obtained from our MixBin strategy are significantly better than those obtained from random selection of the network layers. To perform partial binarization effectively, it is important that both the full-precision and the binary components of the B2NN are appropriately optimized. We also demonstrate that the choice of activation function can have a significant effect on this process. To circumvent this issue, we present BinReLU, which can be used as an effective activation function for both the full-precision and the binary components of any B2NN. Experimental investigations reveal that BinReLU outperforms the other activation functions in all possible B2NN scenarios: zero, partial, and full binarization. Finally, we demonstrate the efficacy of MixBin on the tasks of classification and object tracking using benchmark datasets.
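
The abstract does not spell out the search procedure, so the following is only a minimal sketch of what a budgeted, iterative layer selection could look like. It is not the MixBin algorithm itself. The function name `greedy_budgeted_binarization`, the `evaluate` callback, the layer names, and the toy sensitivity values are all hypothetical and chosen purely for illustration. The sketch greedily binarizes one layer at a time, picking whichever layer keeps the score highest, until roughly the requested fraction of parameters is binary.

```python
from typing import Callable, Dict, List, Set


def greedy_budgeted_binarization(
    layer_params: Dict[str, int],
    budget_fraction: float,
    evaluate: Callable[[Set[str]], float],
) -> List[str]:
    """Pick layers to binarize until about budget_fraction of parameters are binary.

    Hypothetical illustration only; this is NOT the procedure from the paper.
    `evaluate` takes the set of binarized layer names and returns a score
    (higher is better), e.g. validation accuracy after a short fine-tune.
    """
    total = sum(layer_params.values())
    binary: List[str] = []
    binary_params = 0
    remaining = set(layer_params)
    while remaining and binary_params / total < budget_fraction:
        # Try each remaining layer and keep the one whose binarization
        # leaves the score highest (sorted() makes ties deterministic).
        best = max(sorted(remaining), key=lambda name: evaluate(set(binary) | {name}))
        binary.append(best)
        binary_params += layer_params[best]
        remaining.remove(best)
    return binary


# Toy setup (assumed values): parameter counts and a made-up accuracy penalty per layer.
layer_params = {"conv1": 100, "conv2": 400, "conv3": 400, "fc": 100}
sensitivity = {"conv1": 5.0, "conv2": 1.0, "conv3": 2.0, "fc": 4.0}


def toy_evaluate(binary_layers: Set[str]) -> float:
    # Stand-in for a real validation run: start at 100 and subtract penalties.
    return 100.0 - sum(sensitivity[name] for name in binary_layers)


chosen = greedy_budgeted_binarization(layer_params, 0.5, toy_evaluate)
```

Because whole layers are binarized at once, the achieved binary fraction can overshoot the requested budget, which is consistent with the abstract's wording of an "approximate fraction". In practice `evaluate` would be far more expensive than this toy stand-in.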

updated: Sat Nov 12 2022 20:30:38 GMT+0000 (UTC)

published: Sat Nov 12 2022 20:30:38 GMT+0000 (UTC)

arXiv
