ZENG Kai, WAN Zi-xin, WANG Ming-tao, SHEN Tao
Online available: 2024-12-23
Restoring the weight distribution, activation distribution, and gradients of a binary network as closely as possible to those of the original full-precision network can greatly improve its inference ability. However, existing methods apply the restoration operation directly to the binary data in forward propagation, and the gradient approximation functions used in backpropagation are fixed or manually designed, so the restoration efficiency of binary networks still leaves room for improvement. To address this problem, an efficient restoration method for binary neural networks is investigated. First, a distribution restoration method that maximizes information entropy is proposed: by shifting the mean of the original full-precision weights and scaling their modulus, the quantized binary weights directly exhibit maximum distribution restoration, while a simple statistical shift-and-scale factor greatly improves the restoration efficiency of weights and activations. Furthermore, a gradient function based on adaptive distribution approximation is proposed, which dynamically determines the update range of the current gradient at the P-th percentile of the actual distribution of the current full-precision data; it adaptively changes the shape of the approximation function to update gradients efficiently during training, thereby improving the convergence of the model. Theoretical analysis confirms that, while improving execution efficiency, the proposed method achieves maximum restoration of the binary data. Compared with existing state-of-the-art binary network models, the experimental results show excellent performance, reducing the computation time of the distribution-restoration quantization by 60% for ResNet-18 and 67% for ResNet-20.
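The entropy-maximizing restoration described above can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration, not the paper's implementation: the function name `binarize_max_entropy` is hypothetical, and it realizes the stated idea in the standard way, shifting the weights by their mean (so the +1/-1 sign distribution is balanced, maximizing its entropy) and restoring magnitude with a cheap statistical scaling factor (the mean absolute deviation).

```python
import numpy as np

def binarize_max_entropy(w):
    """Hypothetical sketch: binarize weights after mean-shifting so that
    +1 and -1 are (approximately) equally likely, i.e. the binary sign
    distribution has maximum information entropy; a simple statistical
    scaling factor restores the magnitude of the original weights."""
    w_shifted = w - w.mean()          # shift: balance the sign distribution
    alpha = np.abs(w_shifted).mean()  # scale: cheap statistical factor
    return alpha * np.sign(w_shifted), alpha

# Even for an all-positive weight vector, the shifted signs split evenly:
w = np.array([0.1, 0.2, 0.3, 0.4])
wb, alpha = binarize_max_entropy(w)   # signs: [-1, -1, +1, +1], alpha = 0.1
```

Because the shift and scale are simple first-order statistics, the restoration costs one pass over the weights, which is consistent with the reported reduction in quantization time.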
An accuracy of 93.0% is achieved for binary-quantized VGG-Small on the CIFAR-10 dataset and 61.9% for binary-quantized ResNet-18 on the ImageNet dataset, both of which are the best performance among current binary neural networks. The relevant code is available at https://github.com/sjmp525/IA/tree/ER-BNN.
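The adaptive distribution-approximation gradient described in the abstract can likewise be sketched under stated assumptions. The function `adaptive_ste_grad` below is hypothetical: it models the idea as a straight-through-style estimator whose clipping range is not fixed but is the P-th percentile of the current full-precision data's magnitude, so the shape of the effective approximation function adapts to the data distribution at each step.

```python
import numpy as np

def adaptive_ste_grad(w, grad_out, p=99):
    """Hypothetical sketch of an adaptive straight-through estimator:
    gradients pass through only where |w| lies within the p-th percentile
    of the current full-precision distribution, so the update range tracks
    the actual data instead of a fixed, manually chosen threshold."""
    t = np.percentile(np.abs(w), p)      # data-dependent update range
    return grad_out * (np.abs(w) <= t)   # zero the gradient outside the range
```

A fixed clip (e.g. always |w| <= 1) ignores how wide the weight distribution actually is at a given training step; tying the threshold to a percentile keeps a constant fraction of weights updatable as the distribution evolves.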