
Research on Compression Algorithms of Convolutional Neural Networks

Posted on: 2020-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
Full Text: PDF
GTID: 2518305771956079
Subject: Computer technology
Abstract
Recently, deep learning models have achieved excellent results in image classification, natural language processing, and other applications. As performance has improved, model architectures have grown more and more complex, and their storage and computation costs have risen accordingly. To reduce model complexity, researchers have focused mainly on three categories of methods: quantization algorithms, pruning, and lightweight models (such as SqueezeNet).

Quantization algorithms usually replace the parameters of convolution layers with values drawn from a small fixed set (for example, powers of two), so that the floating-point multiplications in convolution can be replaced with bit operations on specific hardware. Most quantization algorithms rely on prior knowledge (hyper-parameters) to determine the target quantization values, so the sets of target values found this way contain redundancy and leave room for further compression. Furthermore, in existing quantization algorithms the whole network, or at best each convolution layer, shares a single set of target values; because this shared set is small, it limits the range the parameters can take and degrades the performance of the quantized model.

Research on pruning has shifted from pruning individual parameters to pruning channels (filters). After pruning, the number of model parameters falls, so both the storage space of the model and the computational resources needed in deployment are greatly reduced, and a filter-pruned model can run directly on existing deep learning libraries. Existing pruning algorithms share a common pipeline: pre-training followed by iterative pruning. Pre-training consumes many epochs, yet existing algorithms still fail to make good pruning decisions.

This thesis addresses each of the issues above. The main contributions are as follows.

First, the set of target quantization values is determined from the distribution of the parameters. Existing quantization algorithms usually determine target values from experience or prior knowledge. Here, a clustering algorithm is first used to generate a set of target quantization values matching the parameter distribution of each convolution layer. In the experiments, the same quantization strategy is employed to compare distribution-based target values against experience-based ones. The results show that distribution-based target values can quantize the whole network with fewer values and further compress model storage.

Second, the set of target quantization values is determined per filter. Existing quantization algorithms seldom consider the scope over which a set of target values applies: the whole network or a whole convolution layer uses the same set. The proposed method narrows this scope and generates a separate set of target values for each filter. In the experiments, the same quantization strategy is employed to compare the per-filter target sets against the original shared sets. The results show that with per-filter target sets the quantized network achieves higher accuracy.
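To make the quantization background concrete, the sketch below shows one common power-of-two scheme: each weight is snapped to the nearest signed power of two, so a multiply by that weight reduces to a bit shift on suitable hardware. This is a minimal illustration, not the thesis's method; the function name and exponent bounds are assumptions.

```python
import numpy as np

def quantize_pow2(w, min_exp=-8, max_exp=0):
    """Snap each weight to the nearest signed power of two.

    Illustrative baseline only (names and bounds are assumptions).
    Rounding log2(|w|) picks the nearest exponent in log space;
    magnitudes below the smallest representable value become zero.
    """
    sign = np.sign(w)
    mag = np.abs(w)
    with np.errstate(divide="ignore"):   # log2(0) -> -inf, clipped below
        exp = np.round(np.log2(mag))
    exp = np.clip(exp, min_exp, max_exp)
    q = sign * np.exp2(exp)
    # Weights too small for the exponent range are rounded to zero.
    return np.where(mag < np.exp2(min_exp - 1), 0.0, q)

kernel = np.random.randn(3, 3) * 0.2     # toy 3x3 convolution kernel
print(quantize_pow2(kernel))
```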
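The first two contributions replace such hand-chosen target values with values fitted to the data. The abstract names only "a clustering algorithm", so the sketch below assumes k-means via scikit-learn; the helper names and cluster counts are mine. It derives one target set per layer, then the per-filter variant in which every filter gets its own, smaller set.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_targets(weights, n_values):
    """Fit target quantization values to the weight distribution."""
    km = KMeans(n_clusters=n_values, n_init=10, random_state=0)
    km.fit(weights.reshape(-1, 1))
    return np.sort(km.cluster_centers_.ravel())

def quantize_to_targets(weights, targets):
    """Snap every weight to its nearest target value."""
    idx = np.argmin(np.abs(weights[..., None] - targets), axis=-1)
    return targets[idx]

# Toy conv layer: 4 filters, 3 input channels, 3x3 kernels.
layer = np.random.randn(4, 3, 3, 3) * 0.1

# Contribution 1: one target set per layer, fitted by clustering.
q_layer = quantize_to_targets(layer, cluster_targets(layer, n_values=8))

# Contribution 2: a separate, smaller target set per filter.
q_filter = np.stack([
    quantize_to_targets(f, cluster_targets(f, n_values=4))
    for f in layer
])
```

The per-filter variant stores a few extra target values per filter, but each set tracks that filter's own distribution, which is the mechanism behind the reported accuracy gain.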
Third, we propose a novel idea: Incremental Pruning based on Less Training (IPLT). This idea greatly reduces the computational cost of training CNNs. Existing pruning strategies perform pruning only after a large number of pre-training epochs, yet the resulting pruning decisions are still not optimal. We use IPLT to improve two existing pruning algorithms; the improved algorithms obtain pruned models from only a small amount of pre-training, and the models' performance is almost as good as that of models pruned after extensive pre-training.
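The abstract does not spell out how IPLT selects filters, so the following is only a schematic of the idea: instead of a single pruning pass after long pre-training, slightly more pruning is applied after each short training phase. The L1-norm ranking and the schedule are assumptions (L1-norm filter pruning is a common baseline), not the thesis's actual criterion.

```python
import numpy as np

def filter_l1_norms(layer):
    """L1 norm per filter; low-norm filters are pruning candidates."""
    return np.abs(layer).sum(axis=(1, 2, 3))

def prune_filters(layer, keep_ratio):
    """Zero out the weakest filters so that `keep_ratio` survive."""
    norms = filter_l1_norms(layer)
    n_keep = max(1, int(round(keep_ratio * len(norms))))
    keep = np.argsort(norms)[::-1][:n_keep]   # strongest filters stay
    mask = np.zeros(len(norms), dtype=bool)
    mask[keep] = True
    pruned = layer.copy()
    pruned[~mask] = 0.0
    return pruned, mask

# Incremental schedule: prune a little more after each short phase,
# rather than once after many pre-training epochs.
layer = np.random.randn(16, 3, 3, 3)
for phase, keep_ratio in enumerate([0.9, 0.75, 0.6, 0.5]):
    # ... a few training epochs would run here ...
    layer, mask = prune_filters(layer, keep_ratio)
    print(f"phase {phase}: {mask.sum()} filters remain")
```

In a real implementation the zeroed filters would be removed outright (and the next layer's input channels shrunk accordingly), so the pruned model runs directly on existing deep learning libraries, as the abstract notes.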
Keywords: Quantization, Distribution of Parameters, Clustering, Filters, Less Pre-training, Pruning