
Research on Quantization and Pruning Methods for Convolutional Neural Networks

Posted on: 2020-09-02    Degree: Master    Type: Thesis
Country: China    Candidate: J Miao    Full Text: PDF
GTID: 2428330590474316    Subject: IC Engineering
Abstract/Summary:
With the development of artificial intelligence, deep learning networks are widely employed in various artificial intelligence solutions. As these networks grow deeper, they also become more complex and redundant, so improving the hardware power efficiency of neural networks has become a hot research topic in deep learning. GPUs have a significant advantage over CPUs in multitask parallel processing because of their many stream-processing units, but this advantage rests on a large number of computing and storage units. In embedded devices the budget for computing and storage units is limited, especially for energy-hungry storage such as DRAM, so optimizing the corresponding memory and computational overhead is essential for deploying the algorithms. Pruning and quantization operations greatly reduce the memory requirements of deep neural networks on embedded devices. The contributions of this thesis are as follows:

First, the traditional pruning operation has a major flaw: once a connection is pruned, its impact on the overall network can no longer be evaluated, so the compression ratio has to be held down to limit the damage to the overall recognition rate, which makes pruning inefficient. To address this defect, this thesis builds on Yiwen Guo's pruning work and introduces a splicing operation into the pruning algorithm so that connections that have been clipped can be recovered. In the experiments, the classic LeNet-5 convolutional neural network achieved 38.4× compression, and the deeper AlexNet network achieved 12.7× compression.

Second, in traditional clustering-based quantization the cluster centers drift during retraining. This thesis improves the clustering algorithm: the probability center is used to quantize the weights, and the errors of the weights are accumulated during retraining, which improves quantization efficiency. The training is also grouped, following research in the quantization field, so that the overall convergence is strengthened. In the experiments, with an overall error of 0.12%, AlexNet achieved a 31× quantization effect and LeNet-5 achieved a 330× quantization effect, compared with the 27× reported by deep compression in 2016, a significant improvement in compression ratio.

The research in this thesis has great theoretical and application value in scenarios where convolutional neural networks face strict energy constraints. The improved algorithms work well in chip designs with high DRAM demand, where they can greatly reduce the energy consumption of the algorithm, and they have broad practical prospects in machine vision applications such as face recognition, image capture, and light-source detection.
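To make the prune-and-splice idea concrete, the following is a minimal NumPy sketch of a dynamic-surgery-style mask update: weights whose magnitude falls below a low threshold are masked out, but they keep receiving updates, and any masked weight that regrows past a higher threshold is spliced back in. The thresholds, tensor shapes, and placeholder gradients are illustrative assumptions, not values taken from the thesis.

```python
import numpy as np

def update_mask(weights, mask, t_prune=0.05, t_splice=0.10):
    """Dynamic-surgery-style mask update: prune small weights,
    splice back weights that have regrown past a higher threshold.
    Thresholds are hypothetical, chosen for illustration only."""
    mag = np.abs(weights)
    mask = np.where(mag < t_prune, 0.0, mask)    # prune weak connections
    mask = np.where(mag > t_splice, 1.0, mask)   # splice (recover) strong ones
    return mask

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(64, 64))   # stand-in weight matrix
m = np.ones_like(w)                        # connection mask

for step in range(100):
    grad = rng.normal(scale=0.01, size=w.shape)  # placeholder gradient
    w -= 0.1 * grad          # all weights keep training, even masked ones
    m = update_mask(w, m)    # periodically re-evaluate prune/splice decisions

effective_w = w * m          # only unmasked weights act in the forward pass
print("kept fraction:", m.mean())
```

Because masked weights continue to be updated, a connection that was clipped too aggressively can recover, which is what allows higher compression ratios without suppressing the recognition rate.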
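The clustering-based quantization can likewise be sketched as a simple 1-D k-means weight-sharing scheme: each weight is replaced by a shared centroid, and during fine-tuning the gradients of all weights that share a centroid are accumulated onto that centroid. This follows the generic deep-compression recipe rather than the thesis's probability-center variant; the cluster count, learning rate, and placeholder gradients are assumptions for illustration.

```python
import numpy as np

def quantize_weights(w, n_clusters=16, iters=20):
    """1-D k-means weight sharing: every weight maps to its nearest
    cluster centroid (a deep-compression-style sketch)."""
    flat = w.ravel()
    centers = np.linspace(flat.min(), flat.max(), n_clusters)  # linear init
    for _ in range(iters):
        labels = np.argmin(np.abs(flat[:, None] - centers[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = flat[labels == k].mean()
    return labels.reshape(w.shape), centers

def retrain_step(labels, centers, grad, lr=0.1):
    """Centroid fine-tuning: gradients of all weights sharing a centroid
    are accumulated onto that centroid before it is updated."""
    flat_labels, flat_grad = labels.ravel(), grad.ravel()
    for k in range(len(centers)):
        sel = flat_labels == k
        if np.any(sel):
            centers[k] -= lr * flat_grad[sel].sum()  # accumulated gradient
    return centers

rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(64, 64))
labels, centers = quantize_weights(w)
centers = retrain_step(labels, centers, rng.normal(scale=1e-3, size=w.shape))
w_q = centers[labels]          # quantized weights used at inference
print("unique weight values after quantization:", len(np.unique(w_q)))
```

Storing only the small codebook of centroids plus per-weight cluster indices is what yields the large quantization ratios reported in the abstract.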
Keywords/Search Tags:DNN, quantization, neural network, prune, compression