
Design And Implementation Of Convolutional Neural Network Accelerator Based On FPGA

Posted on: 2020-09-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Tong
Full Text: PDF
GTID: 2428330596476222
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
With the continuous development of deep learning, the Convolutional Neural Network (CNN) has become the gold standard for many applications in computer vision and is widely used in character recognition, face recognition, image and video classification, scene analysis, and other fields. Because a CNN involves a very large number of operations, it is a computationally intensive algorithm: general-purpose processors cannot fully exploit its parallelism and struggle to meet real-time requirements. At present, CNNs are mainly run on GPUs, but the GPU's high power consumption makes it unsuitable for mobile platforms. The FPGA (Field-Programmable Gate Array) is a programmable device with flexible configurability, rich computing resources, and a short development cycle, and FPGA-based CNN accelerators are becoming increasingly popular.

This thesis studies and designs a modular FPGA-based convolutional neural network accelerator, implemented in a hardware description language to achieve high throughput and high parallelism. The thesis first introduces the basic concepts and principles of artificial neural networks and convolutional neural networks, as well as the process of hardware/software co-design. It then analyzes the key problems in CNN accelerator design, especially the parallelism of the convolution operation: an implementation of each parallel feature is proposed and evaluated from the perspectives of resource occupancy and hardware realization. The network's activation function and parameter precision are also analyzed and selected.

On this basis, the thesis designs an FPGA-based CNN accelerator. The accelerator uses hardware/software co-design on a CPU+FPGA computing framework: the hardware side implements the forward computation of the CNN model, while the software side handles data transmission and control. The hardware design is divided into modules: the convolutional-layer module, the pooling-layer module, the buffer area, and the AXI interface. The accelerator implements the forward computation of the LeNet-5 model. Each layer uses a deep pipeline structure to improve overall throughput, and the multiply-accumulators and the input/output buffers between layers are reused to reduce resource consumption. On the software side, the control-interaction and data-exchange logic between the CPU and the CNN accelerator is designed.

Combining the hardware and software designs, recognition of handwritten digits from the MNIST dataset was tested and achieved a very good classification result: an accuracy of 98.5%, only 0.5% lower than the software implementation. Computing performance reached 10.07 GOP/s, which is 11.7 times that of a general-purpose CPU and 1.1 times that of a general-purpose GPU, while power consumption is only 4.3% and 3.1% of the CPU's and GPU's respectively. Compared with OpenCL- and HLS-based FPGA implementations, the design also achieves higher performance per unit of resources.
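The abstract does not reproduce the thesis's HDL, but the output-channel parallelism it describes can be sketched in software. The sketch below is a minimal, illustrative model (not the thesis implementation): the `unroll` parameter stands in for the number of replicated multiply-accumulate units, and the function name and LeNet-5 layer shapes are assumptions for the example.

```python
import numpy as np

def conv_forward(x, w, unroll=4):
    """Direct 2-D convolution (valid padding, stride 1).

    The inner loop handles `unroll` output channels per shared input
    window, mirroring how an FPGA accelerator replicates MAC units to
    exploit output-channel parallelism while reusing the same input data.
    """
    C_in, H, W = x.shape
    C_out, _, K, _ = w.shape
    H_out, W_out = H - K + 1, W - K + 1
    y = np.zeros((C_out, H_out, W_out))
    for co in range(0, C_out, unroll):           # one group of parallel MAC units
        for i in range(H_out):
            for j in range(W_out):
                patch = x[:, i:i+K, j:j+K]       # input window shared by the group
                for c in range(co, min(co + unroll, C_out)):
                    y[c, i, j] = np.sum(patch * w[c])
    return y

# Shapes like LeNet-5's first layer on a 28x28 MNIST image:
# 1 input channel, 6 output channels, 5x5 kernels -> 24x24 output maps.
x = np.random.rand(1, 28, 28)
w = np.random.rand(6, 1, 5, 5)
y = conv_forward(x, w)
print(y.shape)  # (6, 24, 24)
```

In hardware, the group of `unroll` channels would be computed in the same clock cycle by parallel MAC units fed from one shared line buffer; here the grouping only changes the loop order, but it makes the data reuse pattern explicit.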
Keywords/Search Tags: CNN, FPGA, Hardware acceleration