Research And Implementation Of FPGA-based Accelerating Methods For Convolutional Neural Network

Posted on:2019-01-27

Degree:Master

Type:Thesis

Country:China

Candidate:Y Qiu

Full Text:PDF

GTID:2428330548476164

Subject:Computer Science and Technology

Abstract/Summary:

With the widespread use of deep learning in many fields, the Convolutional Neural Network (CNN), as its basic model, has attracted increasing attention. CNNs are widely used in image classification, face recognition, language detection, document analysis, and so on. However, software-only acceleration of CNNs cannot meet growing speed and power requirements, so designing hardware CNN accelerators has become a research focus in academia. As parallel, compute-intensive acceleration hardware, the FPGA offers an excellent performance-to-power ratio and has unique advantages over GPUs and ASICs. In practical use, however, enormous challenges remain: how to use the limited on-chip FPGA resources efficiently to achieve higher performance with fewer resources, and how to design the architecture of the FPGA hardware modules, or a more general architecture.

This thesis presents an efficient convolution module (ECM). It contains 4 PE units, and each PE unit is responsible for computing one output feature map. Convolution data and convolution parameters are passed between PE units through a cascade connection. To avoid repeatedly reading and writing external memory in layer-serial mode, a double-buffer storage mechanism keeps the intermediate results on the FPGA chip. In addition, the data caching and distribution scheme of the input register, and the internal structure of the PE unit and the pooling module, are designed. An ECM control module manages and schedules each unit within the efficient convolution module.

Based on the common characteristics of convolutional neural networks, a general architecture for an FPGA-based CNN hardware accelerator is designed. This architecture avoids the loss of overall computing speed caused by repeatedly reading and writing off-chip memory, and it mitigates the waste of DSP resources that layer-serial mode incurs when computation is unevenly distributed across layers. The architecture contains multiple groups of efficient convolution modules, which share convolution data in broadcast mode. The top-level control module interacts with the PS side to receive commands and also controls the overall operation flow.

Finally, based on the general architecture proposed in this thesis, an FPGA hardware accelerator is implemented for the ZynqNet model. To further improve speed, the computational precision of ZynqNet is reduced from 32 bits to 16 bits, which allows a parallel structure of 64 PE units to be designed to increase computing parallelism. Results on ImageNet show that the optimized FPGA-based accelerator achieves a 10x speedup over the original ZynqNet and a 20x speedup over an i5-5200U CPU. In terms of performance-to-power ratio, the FPGA accelerator is 5.4 times that of the NVIDIA GTX 970 GPU version.
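
The mapping of output feature maps onto PE units and the reduction to 16-bit arithmetic can be illustrated with a small software model. This is a minimal sketch, not the thesis's hardware design. The abstract does not state the 16-bit number format, so a signed fixed-point format with 8 fractional bits is assumed here, and all function names (`to_fixed`, `to_float`, `pe_conv`, `ecm_conv`) are hypothetical. Only the figure of 4 PE units per module comes from the abstract.

```python
# Illustrative software model only. The Q8.8 fixed-point format and all
# names below are assumptions, not details taken from the thesis.

NUM_PE = 4      # PE units per efficient convolution module (from the abstract)
FRAC_BITS = 8   # assumed: 16-bit signed fixed point with 8 fractional bits


def to_fixed(x, frac_bits=FRAC_BITS):
    """Quantize a float to a saturated 16-bit signed fixed-point integer."""
    q = int(round(x * (1 << frac_bits)))
    return max(-32768, min(32767, q))


def to_float(q, frac_bits=FRAC_BITS):
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def pe_conv(image, kernel):
    """Model one PE: a 'valid' 2D sliding-window multiply-accumulate
    producing one output feature map from fixed-point inputs."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            acc = 0  # wide accumulator; products carry 2 * FRAC_BITS fractional bits
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc >> FRAC_BITS  # rescale to FRAC_BITS (no output saturation here)
    return out


def ecm_conv(image, kernels):
    """Process kernels in groups of NUM_PE. Every PE in a group reads the
    same input data and produces one output feature map."""
    outputs = []
    for base in range(0, len(kernels), NUM_PE):
        group = kernels[base:base + NUM_PE]
        outputs.extend(pe_conv(image, k) for k in group)
    return outputs


if __name__ == "__main__":
    img = [[to_fixed(v) for v in row]
           for row in [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]]
    avg_diag = [[to_fixed(0.5), 0], [0, to_fixed(0.5)]]
    maps = ecm_conv(img, [avg_diag] * 5)  # 5 output maps: one full group of 4, then 1
    print([[to_float(v) for v in row] for row in maps[0]])
```

In hardware the PE units of a group run concurrently, whereas this model evaluates them one after another, so it only shows the data mapping and the effect of 16-bit quantization, not the speedup.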

Keywords/Search Tags:

Convolutional Neural Network, FPGA, general architecture, ZynqNet, acceleration

Related items

1	FPGA-based Accelerator For Convolutional Neural Network
2	Design And FPGA Implementation Of Convolutional Neural Network Acceleration Module
3	Design Of General-purpose Deep Convolutional Neural Network Accelerator Based On FPGA
4	Research On CNN Network Acceleration For Image Classification Based On FPGA
5	Research On Parallel Acceleration Architecture Convolutional Neural Network Based On FPGA
6	Design And Implementation Of Convolutional Neural Network Acceleration Based On FPGA
7	Research Of Acceleration Technology For Convolutional Neural Networks Based On FPGA
8	Design Of Convolutional Neural Network Acceleration System And FPGA Verification
9	Research On Acceleration Scheme Of Convolutional Neural Network Based On CPU-FPGA Heterogeneous Computing
10	Research And Implementation Of Acceleration Of Binary Convolutional Neural Network Based On FPGA