
Research on Quantization and Pruning Methods for Convolutional Neural Networks

Posted on: 2020-09-02    Degree: Master    Type: Thesis
Country: China    Candidate: J Miao    Full Text: PDF
GTID: 2428330590474316    Subject: IC Engineering
Abstract/Summary:
With the development of artificial intelligence, deep learning networks are widely employed in various artificial intelligence solutions. As these networks grow deeper, they also become more complex and redundant, so improving the hardware power efficiency of neural networks has become a hot research topic in deep learning. GPUs have a significant advantage over CPUs in multitask parallel processing because of their many stream-processing units, but this advantage rests on a large number of computing and storage units. In embedded devices the budget for computing and storage units is limited, especially for energy-hungry storage such as DRAM, so optimizing the corresponding memory and computational overhead is essential for deploying the algorithms. Pruning and quantization operations greatly reduce the memory requirements of deep neural networks on embedded devices. The contributions of this thesis are as follows:

First, the traditional pruning operation has a major flaw: once a connection is pruned, its impact on the overall network can no longer be evaluated, so the compression ratio has to be held down to limit the damage to the overall recognition rate, which makes pruning inefficient. To address this defect, this thesis builds on Yiwen Guo's pruning work and introduces a splicing operation into the pruning algorithm so that connections that have been clipped can be recovered. In the experiments, the classic LeNet-5 convolutional neural network achieved 38.4× compression, and the deeper AlexNet network achieved 12.7× compression.

Second, in traditional clustering-based quantization the cluster centers drift during retraining. This thesis improves the clustering algorithm: the probability center is used to quantize the weights, and the errors of the weights are accumulated during retraining, which improves quantization efficiency. The training is also grouped, following research in the quantization field, so that the overall convergence is strengthened. In the experiments, with an overall error of 0.12%, AlexNet achieved a 31× quantization effect and LeNet-5 achieved a 330× quantization effect, compared with the 27× reported by deep compression in 2016, a significant improvement in compression ratio.

The research in this thesis has great theoretical and application value in scenarios where convolutional neural networks face strict energy constraints. The improved algorithms work well in chip designs with high DRAM demand, where they can greatly reduce the energy consumption of the algorithm, and they have broad practical prospects in machine vision applications such as face recognition, image capture, and light-source detection.
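To make the prune-and-splice idea concrete, the following is a minimal NumPy sketch of a dynamic-surgery-style mask update: weights whose magnitude falls below a low threshold are masked out, but they keep receiving updates, and any masked weight that regrows past a higher threshold is spliced back in. The thresholds, tensor shapes, and placeholder gradients are illustrative assumptions, not values taken from the thesis.

```python
import numpy as np

def update_mask(weights, mask, t_prune=0.05, t_splice=0.10):
    """Dynamic-surgery-style mask update: prune small weights,
    splice back weights that have regrown past a higher threshold.
    Thresholds are hypothetical, chosen for illustration only."""
    mag = np.abs(weights)
    mask = np.where(mag < t_prune, 0.0, mask)    # prune weak connections
    mask = np.where(mag > t_splice, 1.0, mask)   # splice (recover) strong ones
    return mask

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(64, 64))   # stand-in weight matrix
m = np.ones_like(w)                        # connection mask

for step in range(100):
    grad = rng.normal(scale=0.01, size=w.shape)  # placeholder gradient
    w -= 0.1 * grad          # all weights keep training, even masked ones
    m = update_mask(w, m)    # periodically re-evaluate prune/splice decisions

effective_w = w * m          # only unmasked weights act in the forward pass
print("kept fraction:", m.mean())
```

Because masked weights continue to be updated, a connection that was clipped too aggressively can recover, which is what allows higher compression ratios without suppressing the recognition rate.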
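The clustering-based quantization can likewise be sketched as a simple 1-D k-means weight-sharing scheme: each weight is replaced by a shared centroid, and during fine-tuning the gradients of all weights that share a centroid are accumulated onto that centroid. This follows the generic deep-compression recipe rather than the thesis's probability-center variant; the cluster count, learning rate, and placeholder gradients are assumptions for illustration.

```python
import numpy as np

def quantize_weights(w, n_clusters=16, iters=20):
    """1-D k-means weight sharing: every weight maps to its nearest
    cluster centroid (a deep-compression-style sketch)."""
    flat = w.ravel()
    centers = np.linspace(flat.min(), flat.max(), n_clusters)  # linear init
    for _ in range(iters):
        labels = np.argmin(np.abs(flat[:, None] - centers[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = flat[labels == k].mean()
    return labels.reshape(w.shape), centers

def retrain_step(labels, centers, grad, lr=0.1):
    """Centroid fine-tuning: gradients of all weights sharing a centroid
    are accumulated onto that centroid before it is updated."""
    flat_labels, flat_grad = labels.ravel(), grad.ravel()
    for k in range(len(centers)):
        sel = flat_labels == k
        if np.any(sel):
            centers[k] -= lr * flat_grad[sel].sum()  # accumulated gradient
    return centers

rng = np.random.default_rng(0)
w = rng.normal(scale=0.05, size=(64, 64))
labels, centers = quantize_weights(w)
centers = retrain_step(labels, centers, rng.normal(scale=1e-3, size=w.shape))
w_q = centers[labels]          # quantized weights used at inference
print("unique weight values after quantization:", len(np.unique(w_q)))
```

Storing only the small codebook of centroids plus per-weight cluster indices is what yields the large quantization ratios reported in the abstract.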
Keywords/Search Tags:DNN, quantization, neural network, prune, compression