
Hardware Acceleration And Implementation Of GoogLeNet Network Based On Sparse Convolution

Posted on: 2021-05-18  Degree: Master  Type: Thesis
Country: China  Candidate: Z H Bai  Full Text: PDF
GTID: 2428330614970926  Subject: Signal and Information Processing
Abstract/Summary:
With the development of artificial intelligence, convolutional neural networks have become a hot research area. However, because of their high computational complexity, traditional CPUs cannot meet real-time requirements. Although GPUs are widely used for network training, their high power consumption makes them unsuitable for embedded applications. FPGAs, with their low power consumption, reconfigurability, and low latency, have therefore gradually become a research hotspot. At present, the conventional approach to deploying a convolutional neural network on an FPGA is to build a large multiply-accumulate array. The peak performance of this approach is limited by the number of multiplier units on the FPGA, and it cannot exploit parameter redundancy to achieve higher performance. This thesis addresses these problems for GoogLeNet with the following methods:

(1) This thesis proposes a multi-dimensional model compression framework, comprising pruning, clustering, and quantization, to lighten GoogLeNet and reduce its large computation and parameter counts. Based on the pruning rate and the distribution of parameters in the different convolution layers of GoogLeNet, the pruning threshold is adjusted dynamically and unimportant parameters are removed. The K-Means clustering algorithm is then used to cluster the convolution kernel weights, with the number of clusters chosen according to the kernel size and the number of non-zero parameters to achieve the best clustering effect. Finally, the Ristretto algorithm is used for 8-bit quantization to reduce the storage space of GoogLeNet. Experimental results show that, after applying the three compression methods, the storage space of the GoogLeNet model is reduced to one tenth of the original model and the computation is reduced to one quarter.

(2) Based on the OpenCL heterogeneous computing framework, and combining the compressed GoogLeNet model with the ABM-SpConv sparse convolution algorithm proposed by the research group, this thesis designs a hardware architecture for GoogLeNet. Addition and multiplication are decoupled into two stages: the feature-map data corresponding to each weight are first accumulated in the addition unit and then multiplied by that weight in the multiplication unit, which reduces the number of hard multiplier cores required. The thesis also proposes fusing the BN layer into the convolution layer to reduce deployment difficulty, and encodes the weight parameters to address the low efficiency of memory access. Finally, a complete design space exploration flow is designed: through theoretical modeling and analysis of resources, frequency, and performance, the optimal configuration of this architecture on the target board is obtained. This work deploys GoogLeNet on an Arria 10 GX FPGA with excellent results. Under the optimal parameter configuration, recognizing one image takes 3.4 ms and the throughput is 1456 GOPS; the energy efficiency is 34 times that of a CPU and 4 times that of a GPU. Compared with the previous best architecture, this work doubles the speed and achieves three times the throughput.
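As a rough illustration of the per-layer dynamic-threshold pruning described in (1), the sketch below prunes one convolution layer by magnitude, deriving the threshold from that layer's own weight distribution so that a target pruning rate is met. The function and parameter names (`prune_layer`, `target_rate`) are illustrative and not taken from the thesis.

```python
import numpy as np

def prune_layer(weights: np.ndarray, target_rate: float) -> np.ndarray:
    """Magnitude pruning: zero out the smallest weights of one layer.

    The threshold is computed from this layer's own weight distribution,
    so different layers end up with different thresholds, mimicking the
    per-layer dynamic threshold described in the abstract.
    """
    flat = np.abs(weights).ravel()
    # Threshold below which `target_rate` of the weights fall.
    threshold = np.quantile(flat, target_rate)
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune roughly 70% of a 64x64x3x3 convolution kernel.
w = np.random.randn(64, 64, 3, 3).astype(np.float32)
w_pruned = prune_layer(w, target_rate=0.7)
print("non-zero ratio:", np.count_nonzero(w_pruned) / w_pruned.size)
```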
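The weight-clustering step can be sketched with scikit-learn's KMeans: each non-zero kernel weight is replaced by its nearest cluster centroid, so only the centroid table and per-weight cluster indices need to be stored. The choice of 16 clusters here is an arbitrary example, not the setting used in the thesis.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_weights(weights: np.ndarray, n_clusters: int = 16):
    """Cluster the non-zero weights of one layer and share centroids."""
    nz_idx = np.nonzero(weights.ravel())[0]
    nz_vals = weights.ravel()[nz_idx].reshape(-1, 1)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(nz_vals)
    shared = weights.ravel().copy()
    # Replace each non-zero weight by the centroid of its cluster.
    shared[nz_idx] = km.cluster_centers_[km.labels_, 0]
    return shared.reshape(weights.shape), km.cluster_centers_.ravel()
```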
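The 8-bit quantization step is in the spirit of Ristretto's dynamic fixed point; the sketch below is a generic fixed-point quantizer under that assumption, not the thesis's exact procedure. It allocates enough integer bits to cover the layer's value range and rounds values to the resulting step size.

```python
import numpy as np

def quantize_fixed_point(x: np.ndarray, total_bits: int = 8) -> np.ndarray:
    """Quantize a tensor to `total_bits` dynamic fixed point (sign bit included)."""
    # Integer bits needed to cover the largest magnitude in this layer.
    int_bits = int(np.ceil(np.log2(np.max(np.abs(x)) + 1e-12))) + 1
    frac_bits = total_bits - int_bits
    step = 2.0 ** (-frac_bits)
    q_min, q_max = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(x / step), q_min, q_max)
    return q * step
```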
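Fusing a batch-normalization layer into the preceding convolution, as proposed in (2), is a standard transformation. A minimal sketch, assuming the usual BN parameters (gamma, beta, running mean and variance), is:

```python
import numpy as np

def fuse_bn_into_conv(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold BN into conv weights w [OC, IC, KH, KW] and bias b [OC]."""
    scale = gamma / np.sqrt(var + eps)        # per-output-channel scale
    w_fused = w * scale[:, None, None, None]  # scale each output-channel filter
    b_fused = (b - mean) * scale + beta       # absorb BN shift into the bias
    return w_fused, b_fused
```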
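The decoupling of addition and multiplication described in (2) follows the accumulate-before-multiply idea: because clustered weights take only a few distinct values, the activations that share the same weight value can be summed first, and each partial sum is multiplied by that value once. Below is a simplified software sketch of this idea for one output pixel; the names are illustrative, and the real design is an OpenCL kernel on the FPGA.

```python
import numpy as np
from collections import defaultdict

def abm_output_pixel(patch: np.ndarray, kernel: np.ndarray) -> float:
    """Accumulate-before-multiply dot product for one output pixel.

    `patch` and `kernel` are flattened and equally sized; the kernel holds
    clustered weights, so many entries share the same value.
    """
    sums = defaultdict(float)
    for a, w in zip(patch.ravel(), kernel.ravel()):
        if w != 0.0:          # skip pruned (zero) weights
            sums[w] += a      # accumulate activations per distinct weight value
    # One multiplication per distinct weight value instead of per weight.
    return sum(w * s for w, s in sums.items())
```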
Keywords/Search Tags:Sparse convolution, GoogLeNet, OpenCL, FPGA, Image Recognition, Heterogeneous Computing