
Research On The Compression And Hardware Acceleration Based On Convolutional Neural Network

Posted on: 2020-04-20    Degree: Master    Type: Thesis
Country: China    Candidate: H H Wu    Full Text: PDF
GTID: 2428330599452876    Subject: Electronic and communication engineering
Abstract/Summary:
Convolutional Neural Networks (CNNs) have developed rapidly in the fields of image, speech, and face recognition, especially image recognition. In practice, applications must often run on embedded platforms with small size and low power consumption, yet CNNs are characterized by huge parameter counts, complex network models, and heavy computation, which makes it difficult for them to run quickly on embedded devices.

To address the huge parameter counts and complex network models, this paper proposes network pruning and weight quantization to compress the network; to address the heavy computation, it accelerates the calculation with an FPGA. The network studied in this paper is Tiny-yolo.

First, the connection structure of the Tiny-yolo network is analyzed, and connections with small weights are pruned to reduce the number of weights and compress the network. The pruned weight matrix is kept in a sparse storage format to reduce the memory footprint of the network model, and the sparse network is retrained so that recognition accuracy does not drop sharply after pruning.

Second, the weights are quantized: the original 32-bit floating-point values of Tiny-yolo are quantized to an 8-bit signed char data type, further reducing the model's memory footprint and computational complexity while keeping the accuracy error within an acceptable range.

Finally, based on the structural characteristics of the Tiny-yolo network, a deeply parallel, pipelined FPGA acceleration and optimization scheme is proposed that speeds up data caching and the convolution operations, so that the Tiny-yolo network runs quickly on the embedded end.

The experimental results show that, while maintaining recognition accuracy, pruning reduces the number of parameters by 90% and shrinks the model from 63.5 MB to 4.55 MB. Quantization achieves a compression ratio of about 4X at the cost of a roughly 2-percentage-point drop in mAP, which has little effect on the final detection results. Compared with running on an ARM Cortex-A9 at its maximum frequency of 667 MHz, the hardware-accelerated design achieves about a 7X speedup.
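As a concrete illustration of the pruning step described above, the following is a minimal sketch in C of magnitude-based pruning with compressed sparse row (CSR) storage. The threshold value, the toy matrix size, and the choice of CSR are illustrative assumptions, not details taken from the thesis.

```c
/* Minimal sketch: magnitude-based pruning packed into CSR storage.
 * Threshold and matrix size are illustrative, not from the thesis. */
#include <stdio.h>
#include <math.h>

/* Drop entries with |w| < thresh; pack survivors into CSR
 * (values, column indices, row pointers). Returns the count kept. */
static int prune_to_csr(const float *w, int rows, int cols, float thresh,
                        float *vals, int *col_idx, int *row_ptr)
{
    int nnz = 0;
    for (int r = 0; r < rows; r++) {
        row_ptr[r] = nnz;
        for (int c = 0; c < cols; c++) {
            float v = w[r * cols + c];
            if (fabsf(v) >= thresh) {   /* keep large-magnitude weights */
                vals[nnz] = v;
                col_idx[nnz] = c;
                nnz++;
            }
        }
    }
    row_ptr[rows] = nnz;
    return nnz;
}

int main(void)
{
    float w[2 * 3] = { 0.9f, -0.01f, 0.3f, 0.002f, -0.7f, 0.05f };
    float vals[6]; int col_idx[6], row_ptr[3];
    int nnz = prune_to_csr(w, 2, 3, 0.1f, vals, col_idx, row_ptr);
    printf("kept %d of 6 weights (%.0f%% pruned)\n",
           nnz, 100.0 * (6 - nnz) / 6);
    return 0;
}
```

In a full pipeline the sparse network would then be retrained, as the abstract describes, so that accuracy recovers after pruning.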
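The 32-bit-float to 8-bit signed char quantization can be sketched as a symmetric linear mapping. The max-abs scale derivation below is one common choice and is an assumption; the abstract does not specify the thesis's exact quantization scheme.

```c
/* Minimal sketch: symmetric linear quantization, float32 -> int8.
 * Scale = max|x| / 127 is an assumed (common) per-tensor choice. */
#include <stdio.h>
#include <math.h>

/* Quantize: q = round(x / scale), clamped to [-127, 127]. */
static void quantize_int8(const float *x, signed char *q, int n,
                          float *scale)
{
    float max_abs = 0.0f;
    for (int i = 0; i < n; i++)
        if (fabsf(x[i]) > max_abs) max_abs = fabsf(x[i]);
    *scale = (max_abs > 0.0f) ? max_abs / 127.0f : 1.0f;
    for (int i = 0; i < n; i++) {
        int v = (int)lroundf(x[i] / *scale);
        if (v >  127) v =  127;
        if (v < -127) v = -127;
        q[i] = (signed char)v;
    }
}

int main(void)
{
    float w[4] = { 0.50f, -1.27f, 0.01f, 1.00f };
    signed char q[4]; float scale;
    quantize_int8(w, q, 4, &scale);
    for (int i = 0; i < 4; i++)   /* dequantized value ~= original */
        printf("%+.2f -> %4d (back: %+.3f)\n", w[i], q[i], q[i] * scale);
    return 0;
}
```

Storing 8-bit values instead of 32-bit floats is what yields the roughly 4X compression ratio reported in the abstract.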
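FPGA acceleration of convolution typically maps a loop nest like the one below onto a pipelined, unrolled datapath. This is a plain C sketch with HLS-style directives shown only as comments; the kernel size, toy dimensions, and unroll strategy are illustrative and not the thesis's actual design.

```c
/* Minimal sketch: the loop structure commonly mapped to a
 * parallel-pipelined FPGA convolution engine. HLS pragmas appear
 * as comments; sizes are toy values, not the thesis's design. */
#include <stdio.h>

#define K 3                      /* kernel size */
#define W_IN 8                   /* toy input width/height */
#define W_OUT (W_IN - K + 1)

void conv2d(const float in[W_IN][W_IN], const float k[K][K],
            float out[W_OUT][W_OUT])
{
    for (int y = 0; y < W_OUT; y++) {
        for (int x = 0; x < W_OUT; x++) {
            /* #pragma HLS PIPELINE II=1  -- one output per cycle */
            float acc = 0.0f;
            for (int ky = 0; ky < K; ky++)
                for (int kx = 0; kx < K; kx++)
                    /* #pragma HLS UNROLL -- K*K multipliers in parallel */
                    acc += in[y + ky][x + kx] * k[ky][kx];
            out[y][x] = acc;
        }
    }
}

int main(void)
{
    float in[W_IN][W_IN], k[K][K], out[W_OUT][W_OUT];
    for (int i = 0; i < W_IN; i++)
        for (int j = 0; j < W_IN; j++) in[i][j] = 1.0f;
    for (int i = 0; i < K; i++)
        for (int j = 0; j < K; j++) k[i][j] = 1.0f / (K * K);
    conv2d(in, k, out);
    printf("out[0][0] = %.2f\n", out[0][0]); /* mean of 3x3 patch = 1.00 */
    return 0;
}
```

On hardware, pipelining the output loop and unrolling the kernel loops is one standard way to overlap data caching with the multiply-accumulate work, which is the kind of optimization the abstract attributes to the proposed scheme.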
Keywords/Search Tags: neural network, compression, hardware acceleration, FPGA