
Research On Binary Quantization Methods Of Deep Learning Models

Posted on: 2022-06-10    Degree: Master    Type: Thesis
Country: China    Candidate: S Zhang    Full Text: PDF
GTID: 2518306536463314    Subject: Electronic Science and Technology
Abstract/Summary:
In recent years, deep learning has made breakthrough progress in many fields. However, to cope with increasingly complex artificial intelligence applications, the computational and storage complexity of deep learning models has grown steadily, and their demand for computing resources and power has increased significantly. As a result, deep learning applications running on embedded and other resource-constrained devices usually request computing services from a server (the cloud). This cloud-dependent model, however, faces challenges in data transmission cost, latency, stability and, above all, privacy. Deploying deep learning computation directly on embedded devices has therefore become the main alternative, and compressing and accelerating deep learning models has become a research hotspot. Among compression approaches, binary neural networks have attracted particular attention for their extremely low parameter storage and highly energy-efficient bit operations, and notable progress has been made. Nevertheless, binary neural networks still face two challenges: on the one hand, extreme binary quantization reduces the representation ability and accuracy of the deep neural network; on the other hand, current binary neural networks still leave room for further compression.

In response to these problems, this thesis takes image classification as the target application and proposes two corresponding methods. For accuracy, a new adaptive quantization method is proposed to reduce the loss of network accuracy; for complexity, a compact binary neural network is proposed that further reduces the computational requirements of the network without losing accuracy. The main contributions are as follows:

An adaptive and learnable binary quantization method. To address the shortcomings of existing binary quantization methods, a parameterized neural encoder is proposed to replace the traditional sign function for quantizing the network's weight parameters; this encoder learns a better quantization strategy during training. Based on the correlation among the elements within a weight channel, two neural encoders are designed. Inspired by knowledge distillation, a distribution loss is introduced to aid the training of the binary neural network. By minimizing the final loss, the binary neural network and the neural encoder are trained and updated end to end. Experiments show that the proposed method effectively reduces the accuracy drop: on the ImageNet dataset, accuracy improves by 1.9% over the baseline network and by 1.4% over the most recent prior work, while the method also stabilizes and accelerates the training of the binary neural network.
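To make the idea concrete, the following PyTorch sketch shows one way a learnable binarizer and a distillation-style distribution loss could be wired together. It is a minimal illustration, not the thesis's actual code: the names (SignSTE, LearnableBinarizer, BinaryConv2d, distribution_loss), the per-channel affine encoder, and the exact loss form are assumptions.

```python
# Minimal sketch of a learnable binary weight quantizer (illustrative names,
# not taken from the thesis).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SignSTE(torch.autograd.Function):
    """Binarize to {-1, +1} in forward; straight-through gradient in backward."""

    @staticmethod
    def forward(ctx, x):
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output


class LearnableBinarizer(nn.Module):
    """Small parameterized encoder that replaces the fixed sign() function.

    A learnable per-output-channel affine transform is applied before
    binarization, so the quantization strategy is learned end to end."""

    def __init__(self, out_channels):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(out_channels, 1, 1, 1))
        self.shift = nn.Parameter(torch.zeros(out_channels, 1, 1, 1))

    def forward(self, weight):
        encoded = weight * self.scale + self.shift   # learnable pre-quantization transform
        binary = SignSTE.apply(encoded)              # {-1, +1} codes
        alpha = weight.abs().mean(dim=(1, 2, 3), keepdim=True)
        return binary * alpha                        # per-channel re-scaling


class BinaryConv2d(nn.Conv2d):
    """Convolution whose weights are quantized by the learnable binarizer."""

    def __init__(self, in_ch, out_ch, kernel_size, **kwargs):
        super().__init__(in_ch, out_ch, kernel_size, **kwargs)
        self.binarizer = LearnableBinarizer(out_ch)

    def forward(self, x):
        return F.conv2d(x, self.binarizer(self.weight), self.bias,
                        self.stride, self.padding, self.dilation, self.groups)


def distribution_loss(student_logits, teacher_logits, T=4.0):
    """Knowledge-distillation-style auxiliary loss: match the binary network's
    output distribution to a full-precision teacher (exact form may differ)."""
    p = F.log_softmax(student_logits / T, dim=1)
    q = F.softmax(teacher_logits / T, dim=1)
    return F.kl_div(p, q, reduction="batchmean") * (T * T)
```

During training, the classification loss and the distribution loss would be summed, and gradients flow through the straight-through estimator to update both the latent full-precision weights and the encoder parameters.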
A compact binary neural network (MicroNet). To address the computational complexity of current binary neural networks, a lightweight convolution structure is introduced and the problems of directly binarizing it are analyzed. On this basis, an information-enhanced binary convolution module is designed, which adds a bypass residual connection and uses group convolution in place of depthwise convolution. This reduces the information loss caused by binary quantization, and adjusting the number of groups yields a balance between accuracy and computational complexity. Built from the information-enhanced module, the compact binary neural network MicroNet is constructed. Experiments show that, compared with traditional binary neural networks, MicroNet achieves lower computational complexity with similar accuracy.
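The sketch below illustrates, under assumptions, what such an information-enhanced binary block could look like: a binarized 3x3 group convolution (in place of a depthwise convolution) followed by batch normalization and a bypass residual connection. The names (BinaryGroupConv2d, InfoEnhancedBinaryBlock) and the layer ordering are illustrative rather than the thesis's MicroNet definition.

```python
# Sketch of an "information-enhanced" binary convolution block (illustrative,
# not the thesis's MicroNet code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SignSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output


class BinaryGroupConv2d(nn.Conv2d):
    """Group convolution whose weights are binarized with sign() + a straight-through estimator."""

    def forward(self, x):
        w = SignSTE.apply(self.weight) * self.weight.abs().mean()
        return F.conv2d(x, w, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


class InfoEnhancedBinaryBlock(nn.Module):
    """Binary 3x3 group conv + BN, wrapped with a bypass residual connection.

    `groups` trades accuracy against cost: fewer groups keep more cross-channel
    information (closer to a dense conv), more groups are cheaper (closer to a
    depthwise conv)."""

    def __init__(self, channels, groups=4):
        super().__init__()
        self.conv = BinaryGroupConv2d(channels, channels, kernel_size=3,
                                      padding=1, groups=groups, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.PReLU(channels)

    def forward(self, x):
        out = self.bn(self.conv(x))
        out = out + x   # bypass residual recovers information lost to binarization
        return self.act(out)


# Usage sketch: a full-precision stem followed by stacked binary blocks.
if __name__ == "__main__":
    stem = nn.Sequential(nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64))
    body = nn.Sequential(*[InfoEnhancedBinaryBlock(64, groups=4) for _ in range(3)])
    x = torch.randn(1, 3, 224, 224)
    print(body(stem(x)).shape)   # torch.Size([1, 64, 112, 112])
```

The residual bypass and the grouped (rather than depthwise) binary convolution are the two ingredients named in the abstract; how many groups to use is the knob that trades accuracy for computational cost.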
Keywords/Search Tags: Deep Learning, Deep Neural Network Compression, Lightweight Neural Network, Adaptive Quantization Method