Compression Algorithm And Circuit Design Of Convolutional Neural Networks

Posted on:2020-10-03

Degree:Master

Type:Thesis

Country:China

Candidate:W W Pan

Full Text:PDF

GTID:2428330626950789

Subject:Integrated circuit engineering

Abstract/Summary:

Deep convolutional neural networks (CNNs) are the foundation of modern artificial intelligence applications and have achieved great success in image and speech processing. However, CNN models are computationally expensive and memory intensive, which hinders their deployment on mobile devices. Research on CNN compression algorithms and hardware acceleration circuits is therefore of practical value. This thesis first summarizes common CNN compression algorithms and accelerators.

To address the large number of parameters and the high computational complexity of CNNs in practical applications, a three-stage compression pipeline is designed: channel pruning, fully connected layer pruning, and quantization. To improve accuracy, the training loss is optimized for channel pruning. On the CIFAR-10 dataset, the method reduces the required storage by 18×, with only a 2.79% loss of accuracy on VGGNet.

In addition, an energy-efficient reconfigurable accelerator is designed. The memory access structure of the accelerator is optimized according to the characteristics of convolution computation. To achieve high energy efficiency, the accelerator can also reconfigure its data paths to map different data reuse patterns. The accelerator is implemented on a Virtex-7 FPGA. AlexNet and VGGNet are used as benchmarks to verify the inference acceleration achieved by the accelerator and the compression algorithm. On the AlexNet benchmark, the system achieves an average throughput of 89 GOPS, with a per-DSP performance of 0.18 GOPS at 100 MHz. With VGGNet on 32×32 test images, the recognition speed reaches 47.6 fps, and the compressed model reaches 61.4 fps. The design offers low storage requirements, good real-time performance, and strong configurability, which meets the requirements of deploying convolutional neural networks on mobile devices.
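The three compression stages named in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's method: the abstract does not give the pruning criterion or the quantization scheme, so the sketch assumes L1-norm channel ranking, magnitude thresholding for the fully connected layer, and symmetric uniform 8-bit quantization. All function names and parameters (`prune_channels`, `keep_ratio`, `threshold`, `bits`) are hypothetical.

```python
# Illustrative three-stage compression sketch (assumed criteria, not the thesis's).

def l1_norm(channel):
    """Sum of absolute weights in one output channel (a flat list of floats)."""
    return sum(abs(w) for w in channel)

def prune_channels(channels, keep_ratio):
    """Stage 1 (assumed L1 criterion): keep the channels with the largest
    L1 norm, preserving their original order. Returns (kept, kept_indices)."""
    n_keep = max(1, int(len(channels) * keep_ratio))
    ranked = sorted(range(len(channels)),
                    key=lambda i: l1_norm(channels[i]), reverse=True)
    kept = sorted(ranked[:n_keep])
    return [channels[i] for i in kept], kept

def prune_weights(weights, threshold):
    """Stage 2 (assumed magnitude criterion): zero out fully connected
    weights whose absolute value is below the threshold."""
    return [w if abs(w) >= threshold else 0.0 for w in weights]

def quantize(weights, bits=8):
    """Stage 3 (assumed symmetric uniform scheme): map floats to signed
    integer codes. Returns (codes, scale) so that w is roughly code * scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    levels = 2 ** (bits - 1) - 1
    scale = max_abs / levels if max_abs > 0 else 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(codes, scale):
    """Recover approximate float weights from integer codes."""
    return [c * scale for c in codes]

if __name__ == "__main__":
    conv = [[0.1, -0.1], [1.0, 2.0], [0.5, -0.5], [0.0, 0.01]]
    kept, idx = prune_channels(conv, keep_ratio=0.5)
    print("kept channel indices:", idx)

    fc = prune_weights([0.05, -0.3, 0.2, -0.01], threshold=0.1)
    codes, scale = quantize(fc, bits=8)
    print("codes:", codes)
    print("reconstructed:", dequantize(codes, scale))
```

In a real pipeline the network would normally be fine-tuned after each pruning stage to recover accuracy. The sketch omits that step and shows only how each stage reduces what has to be stored.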

Keywords/Search Tags:

Convolutional Neural Network, Pruning, Quantization, Convolutional Neural Network Accelerator, Reconfigurable Architecture

Related items

1	Design And Verification Of DNN Compression Algorithm Based On Structure Pruning
2	Compression Research Of Convolutional Neural Network And FPGA Design Verification
3	Research On Convolutional Neural Network Pruning Technology
4	Research And Design Of Convolutional Neural Network Accelerator Based On Heterogeneous SOC
5	Research On Convolutional Neural Network Accelerator For Mobile Terminals
6	Research On Key Technologies Of Reconfigurable Neural Network Accelerator Design
7	Design And Research Of Sparse Convolutional Neural Network Accelerator On FPGAs
8	Design Of General-purpose Convolutional Neural Network Accelerator Based On FPGA
9	Design And Implementation Of Convolutional Neural Network Accelerator Based On Affine Quantization
10	VLSI Architecture Design For Binary Convolutional Neural Network Accelerator