
Design And Verification Of DNN Compression Algorithm Based On Structure Pruning

Posted on: 2022-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Fan
Full Text: PDF
GTID: 2518306740993879
Subject: IC Engineering

Abstract/Summary:
As a popular research direction in the field of artificial intelligence, deep convolutional neural networks have been widely applied to target recognition, speech segmentation, and other tasks. However, their enormous parameter counts and computational costs limit the deployment of deep convolutional neural networks on small hardware platforms such as mobile devices. It is therefore important to design an efficient DNN compression algorithm that greatly reduces parameters and computation while preserving model accuracy.

This thesis designs a DNN compression algorithm based on structured pruning, which greatly reduces the parameters and computation of a network while maintaining its accuracy. The algorithm combines grouped channel pruning with power exponential quantization to fully exploit the sparse network structure and data-integration capability that the combination of the two provides. The network accuracy and compression rate achieved by the pruning and quantization stages are controlled by two designed hyperparameters, so that the algorithm can adapt to different hardware deployment requirements. A sensitivity analysis and a grouping strategy applied to the DNN determine the order in which network layers are compressed, so that the network suffers almost no accuracy loss during compression.

The DNN compression algorithm based on structured pruning has been tested on both single-branch and multi-branch networks. For the single-branch network AlexNet on the ImageNet dataset, the error rate increased by only 0.95%; for VGGNet on the CIFAR-10 dataset, the error rate increased by only 0.52%, while network parameters were compressed 11 times and computation was reduced by 75%. For the multi-branch network ResNet on ImageNet, an error-rate increase of 3.09% yielded a 15-times parameter compression and a 72.6% reduction in computation.

Because the compressed sparse model's parameters are drawn from a discrete set consisting of 0 and powers of 2, hardware deployments can use shifters instead of traditional multipliers to perform operations such as multiplication and accumulation, reducing the network's computing-resource consumption on the hardware. A corresponding FPGA verification system, based on the compression of VGG16 on CIFAR-10, was designed, and the control code was written for the Zynq XC7Z100 chip. At a working frequency of 150 MHz, the power consumption is only 2.55 W, which completes the verification of the algorithm.
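The sensitivity-analysis step described above can be sketched as follows: prune each layer in isolation at a trial ratio, measure the resulting accuracy drop, and compress the least sensitive layers first. This is a minimal illustration only; the layer names, accuracy values, and the helper itself are hypothetical and not taken from the thesis.

```python
def rank_layers_by_sensitivity(base_acc, pruned_acc):
    """Order layers from least to most sensitive.

    base_acc   -- accuracy of the unpruned network
    pruned_acc -- dict {layer_name: accuracy after pruning only that layer}
    Returns layer names sorted by ascending accuracy drop, i.e. the
    suggested compression order (least sensitive first).
    """
    drops = {name: base_acc - acc for name, acc in pruned_acc.items()}
    return sorted(drops, key=drops.get)

# Hypothetical per-layer trial results for a small network:
order = rank_layers_by_sensitivity(
    0.92, {"conv1": 0.80, "conv2": 0.91, "fc": 0.90}
)
```

Compressing in this order lets the most fragile layers keep more of their capacity, which is consistent with the abstract's claim of near-zero accuracy loss during compression.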
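The shifter-for-multiplier substitution follows from the weight set {0, ±2^e}: multiplying an integer activation by ±2^e is a left shift plus a sign. The sketch below shows a power-of-two quantizer and a shift-based multiply-accumulate; the pruning threshold and exponent handling are illustrative assumptions, not the thesis's exact scheme (which restricts exponents per its own design).

```python
import math

def quantize_pow2(w, prune_thresh=0.05):
    """Map a weight to 0 (pruned) or the nearest signed power of two.
    prune_thresh is an illustrative value, not taken from the thesis."""
    if abs(w) < prune_thresh:
        return 0.0
    e = round(math.log2(abs(w)))
    return math.copysign(2.0 ** e, w)

def shift_mac(acts, exps, signs):
    """Multiply-accumulate via shifts: x * (s * 2**e) == s * (x << e)
    for non-negative integer exponents, as a hardware shifter computes it."""
    acc = 0
    for x, e, s in zip(acts, exps, signs):
        acc += s * (x << e)
    return acc

# Weights +1, -2, +8 encoded as (exponent, sign) pairs; integer activations:
result = shift_mac([3, 5, 2], exps=[0, 1, 3], signs=[1, -1, 1])
```

On an FPGA, replacing DSP multipliers with shift-and-add logic in this way is what reduces the computing-resource consumption reported for the Zynq XC7Z100 design.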
Keywords/Search Tags:Convolutional Neural Network, Group Structure Pruning, Power Exponential Quantization, Convolutional Neural Network Accelerator