Random Bias Initialization Improves Quantized Training
Binary neural networks improve the computational efficiency of deep models by a large margin. However, a performance gap remains between successful full-precision training and binary training. We offer some insights into why this accuracy drop occurs and call for a better understanding of binary network geometry. We start by analyzing full-precision neural networks with ReLU activation and comparing them with their binarized versions. This comparison suggests initializing networks with random bias, a counterintuitive remedy.
updated: Mon Apr 20 2020 19:50:23 GMT+0000 (UTC)
published: Mon Sep 30 2019 04:01:13 GMT+0000 (UTC)
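
The random bias initialization named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not specify the distribution, its scale, or the binarization scheme, so everything below is an assumption made for illustration: the function names `init_dense_layer` and `binary_forward`, the uniform range `bias_scale`, the He-style weight scaling, and the sign binarization of both weights and activations are hypothetical choices, not the authors' method.

```python
import numpy as np


def init_dense_layer(fan_in, fan_out, rng, random_bias=True, bias_scale=1.0):
    """Return (W, b) for one dense layer.

    Weights use He-style Gaussian scaling (an assumption, common with ReLU).
    Biases are drawn from U(-bias_scale, bias_scale) when random_bias is True
    (an assumed distribution); otherwise they are zero, the usual default.
    """
    W = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))
    if random_bias:
        b = rng.uniform(-bias_scale, bias_scale, size=fan_out)
    else:
        b = np.zeros(fan_out)
    return W, b


def binary_forward(x, W, b):
    """Forward pass with sign-binarized weights and a sign activation.

    This is one simple binarization scheme, assumed here for illustration.
    """
    W_bin = np.where(W >= 0, 1.0, -1.0)   # weights mapped to {-1, +1}
    pre_activation = x @ W_bin + b        # bias shifts each unit's threshold
    return np.where(pre_activation >= 0, 1.0, -1.0)


rng = np.random.default_rng(0)
W, b = init_dense_layer(8, 4, rng, random_bias=True)
W0, b0 = init_dense_layer(8, 4, rng, random_bias=False)
out = binary_forward(np.ones((2, 8)), W, b)
```

With zero biases, every unit in the sketch switches sign at a hyperplane through the origin; random biases give each unit a different threshold at initialization. Whether this matches the geometric argument of the paper is not something the abstract alone confirms.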
