
Design And Verification Of DNN Compression Algorithm Based On Structure Pruning

Posted on: 2022-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Fan
Full Text: PDF
GTID: 2518306740993879
Subject: IC Engineering

Abstract/Summary:
As a popular research direction in the field of artificial intelligence, deep convolutional neural networks have been widely applied to target recognition, speech segmentation, and other tasks. However, their enormous parameter counts and computational costs limit the deployment of deep convolutional neural networks on small hardware platforms such as mobile devices. It is therefore important to design an efficient DNN compression algorithm that greatly reduces parameters and computation while preserving model accuracy.

This thesis designs a DNN compression algorithm based on structured pruning, which greatly reduces the parameters and computation of a network while maintaining its accuracy. The algorithm combines grouped channel pruning with power exponential quantization to fully exploit the sparse network structure and data-integration capability that the combination of the two provides. The network accuracy and compression rate achieved by the pruning and quantization stages are controlled by two designed hyperparameters, so that the algorithm can adapt to different hardware deployment requirements. A sensitivity analysis and a grouping strategy applied to the DNN determine the order in which network layers are compressed, so that the network suffers almost no accuracy loss during compression.

The DNN compression algorithm based on structured pruning has been tested on both single-branch and multi-branch networks. For the single-branch network AlexNet on the ImageNet dataset, the error rate increased by only 0.95%; for VGGNet on the CIFAR-10 dataset, the error rate increased by only 0.52%, while network parameters were compressed 11 times and computation was reduced by 75%. For the multi-branch network ResNet on ImageNet, an error-rate increase of 3.09% yielded a 15-times parameter compression and a 72.6% reduction in computation.

Because the compressed sparse model's parameters are drawn from a discrete set consisting of 0 and powers of 2, hardware deployments can use shifters instead of traditional multipliers to perform operations such as multiplication and accumulation, reducing the network's computing-resource consumption on the hardware. A corresponding FPGA verification system, based on the compression of VGG16 on CIFAR-10, was designed, and the control code was written for the Zynq XC7Z100 chip. At a working frequency of 150 MHz, the power consumption is only 2.55 W, which completes the verification of the algorithm.
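The sensitivity-analysis step described above can be sketched as follows: prune each layer in isolation at a trial ratio, measure the resulting accuracy drop, and compress the least sensitive layers first. This is a minimal illustration only; the layer names, accuracy values, and the helper itself are hypothetical and not taken from the thesis.

```python
def rank_layers_by_sensitivity(base_acc, pruned_acc):
    """Order layers from least to most sensitive.

    base_acc   -- accuracy of the unpruned network
    pruned_acc -- dict {layer_name: accuracy after pruning only that layer}
    Returns layer names sorted by ascending accuracy drop, i.e. the
    suggested compression order (least sensitive first).
    """
    drops = {name: base_acc - acc for name, acc in pruned_acc.items()}
    return sorted(drops, key=drops.get)

# Hypothetical per-layer trial results for a small network:
order = rank_layers_by_sensitivity(
    0.92, {"conv1": 0.80, "conv2": 0.91, "fc": 0.90}
)
```

Compressing in this order lets the most fragile layers keep more of their capacity, which is consistent with the abstract's claim of near-zero accuracy loss during compression.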
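The shifter-for-multiplier substitution follows from the weight set {0, ±2^e}: multiplying an integer activation by ±2^e is a left shift plus a sign. The sketch below shows a power-of-two quantizer and a shift-based multiply-accumulate; the pruning threshold and exponent handling are illustrative assumptions, not the thesis's exact scheme (which restricts exponents per its own design).

```python
import math

def quantize_pow2(w, prune_thresh=0.05):
    """Map a weight to 0 (pruned) or the nearest signed power of two.
    prune_thresh is an illustrative value, not taken from the thesis."""
    if abs(w) < prune_thresh:
        return 0.0
    e = round(math.log2(abs(w)))
    return math.copysign(2.0 ** e, w)

def shift_mac(acts, exps, signs):
    """Multiply-accumulate via shifts: x * (s * 2**e) == s * (x << e)
    for non-negative integer exponents, as a hardware shifter computes it."""
    acc = 0
    for x, e, s in zip(acts, exps, signs):
        acc += s * (x << e)
    return acc

# Weights +1, -2, +8 encoded as (exponent, sign) pairs; integer activations:
result = shift_mac([3, 5, 2], exps=[0, 1, 3], signs=[1, -1, 1])
```

On an FPGA, replacing DSP multipliers with shift-and-add logic in this way is what reduces the computing-resource consumption reported for the Zynq XC7Z100 design.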
Keywords/Search Tags:Convolutional Neural Network, Group Structure Pruning, Power Exponential Quantization, Convolutional Neural Network Accelerator