Deep neural networks typically contain a large number of parameters and require extensive computation during inference, which poses significant challenges for deployment hardware in terms of storage capacity and computational power. To leverage the advantages of deep neural networks on resource-constrained mobile and embedded devices, large networks usually need to be made lightweight. Binary quantization, which quantizes weights and activations directly into 1-bit (+1 or -1) representations, has emerged as a remarkable lightweighting technique: it sharply reduces both the parameter count and the computational cost of a network. However, current binary neural networks still suffer from significant accuracy degradation. To further improve the accuracy of binary neural networks, this paper analyzes their working principles and training process in depth, optimizes the basic structure of binary neural networks, and proposes training optimization methods including weight constraint and weight transition. The specific research work is as follows:

1. The basic structure of a binary neural network has a significant impact on its accuracy. This paper analyzes the basic structures commonly used in binary neural networks and proposes improvements to address structural redundancy, vanishing gradients, and weak representational ability in existing designs. Specifically, the improvements include removing the unnecessary feature-map scaling operation, proposing the Repeat channel dimension enhancement method, using PReLU as the activation function, and adding a Batch Normalization layer at the shortcut connection. In addition, this paper analyzes the role of learning rate decay in training binary neural networks and compares the advantages and disadvantages of multi-step decay and cosine annealing. The optimized network structure yields a significant accuracy improvement over the original one: on the CIFAR-10 dataset, the optimized ResNet-20 network improves accuracy by 2% compared with the unoptimized network. When training the optimized binary network structure with several learning rate decay schedules, the experiments show that cosine annealing has a clear advantage over the other schedules for binary neural networks.

2. During the training of a binary neural network, difficulty in updating weights and frequent weight flipping coexist and can convert into each other; they are two important causes of accuracy loss. To address the difficulty of weight updates, this paper proposes a weight constraint method, which constrains the full-precision proxy value Wr within a carefully calculated threshold range so that the weights remain able to update. To address frequent weight flipping, a weight transition method is proposed: when the sign of the full-precision proxy value Wr changes, it is reset to s_th, a value that varies with the learning rate. The weight transition method significantly improves the stability and convergence speed of the training process.
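The abstract describes weight constraint and weight transition only at a high level, so the following PyTorch sketch is illustrative rather than the thesis's actual formulation: the clamp threshold, the factor k, the assumption that s_th scales linearly with the learning rate, and the reset toward the new sign are placeholders introduced here for illustration.

```python
import torch

def weight_constraint(w_real: torch.Tensor, threshold: float) -> None:
    """Weight constraint (sketch): keep the full-precision proxy weights Wr
    inside [-threshold, +threshold] so that subsequent gradient updates can
    still change the sign of the binarized weight."""
    with torch.no_grad():
        w_real.clamp_(-threshold, threshold)

def weight_transition(w_real: torch.Tensor, w_prev: torch.Tensor,
                      lr: float, k: float = 1.0) -> None:
    """Weight transition (sketch): when the sign of a proxy weight flips
    between two updates, reset its magnitude to s_th, a value tied to the
    current learning rate, to damp oscillatory flipping. The factor k and
    the linear dependence of s_th on lr are assumptions made for this sketch."""
    with torch.no_grad():
        s_th = k * lr
        flipped = torch.sign(w_real) != torch.sign(w_prev)
        w_real[flipped] = torch.sign(w_real[flipped]) * s_th
```

In this sketch, both functions would be applied to a layer's full-precision proxy weights after each optimizer step, with a snapshot of the previous weights kept so that sign flips can be detected.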
Experimental results demonstrate that both the weight constraint and weight transition methods have strong generality: applying them to other binary neural network optimization methods also improves convergence speed and accuracy without any additional memory or computational overhead. The proposed method achieves 88.9% Top-1 accuracy with the ResNet-20 network on the CIFAR-10 dataset, 0.7% higher than AdaBin, and 63.6% Top-1 accuracy with the ResNet-18 network on the ImageNet dataset, 0.5% higher than AdaBin.
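For context on the 1-bit quantization and the structural changes summarized above, the following is a minimal PyTorch sketch: a sign-based binarizer with a straight-through estimator for the backward pass, a binary convolution without feature-map scaling, and a basic block that uses PReLU and a Batch Normalization layer on the shortcut branch. The layer ordering and the omission of the Repeat channel dimension enhancement are assumptions; this is not the thesis's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BinaryQuant(torch.autograd.Function):
    """1-bit quantization: forward maps values to {-1, +1} with sign();
    backward uses the straight-through estimator, passing gradients only
    where the full-precision input lies in [-1, 1]."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * (x.abs() <= 1).float()

class BinaryConv2d(nn.Conv2d):
    """Convolution with weights and activations binarized to +1/-1,
    with no per-channel feature-map scaling applied afterwards."""
    def forward(self, x):
        bx = BinaryQuant.apply(x)
        bw = BinaryQuant.apply(self.weight)
        return F.conv2d(bx, bw, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)

class BinaryBasicBlock(nn.Module):
    """Assumed block layout: BinaryConv -> BatchNorm -> PReLU, with an
    extra BatchNorm layer on the shortcut connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = BinaryConv2d(channels, channels, 3, padding=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.PReLU(channels)
        self.bn_shortcut = nn.BatchNorm2d(channels)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)) + self.bn_shortcut(x))
```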