
Design And Implementation Of Convolutional Neural Network Accelerator Based On FPGA

Posted on: 2020-09-29
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Tong
Full Text: PDF
GTID: 2428330596476222
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
With the continuous development of deep learning, the Convolutional Neural Network (CNN) has become the gold standard for many applications in computer vision and is widely used in character recognition, face recognition, image and video classification, scene analysis, and other fields. Because a CNN involves a very large number of operations, it is a computationally intensive algorithm: general-purpose processors cannot fully exploit its parallelism and struggle to meet real-time requirements. At present, CNNs are mainly run on GPUs, but the GPU's high power consumption makes it unsuitable for mobile platforms. The FPGA (Field-Programmable Gate Array) is a programmable device with flexible configurability, rich computing resources, and a short development cycle, and FPGA-based CNN accelerators are becoming increasingly popular.

This thesis studies and designs a modular FPGA-based convolutional neural network accelerator, implemented in a hardware description language to achieve high throughput and high parallelism. The thesis first introduces the basic concepts and principles of artificial neural networks and convolutional neural networks, as well as the process of hardware/software co-design. It then analyzes the key problems in CNN accelerator design, especially the parallelism of the convolution operation: an implementation of each parallel feature is proposed and evaluated from the perspectives of resource occupancy and hardware realization. The network's activation function and parameter precision are also analyzed and selected.

On this basis, the thesis designs an FPGA-based CNN accelerator. The accelerator uses hardware/software co-design on a CPU+FPGA computing framework: the hardware side implements the forward computation of the CNN model, while the software side handles data transmission and control. The hardware design is divided into modules: the convolutional-layer module, the pooling-layer module, the buffer area, and the AXI interface. The accelerator implements the forward computation of the LeNet-5 model. Each layer uses a deep pipeline structure to improve overall throughput, and the multiply-accumulators and the input/output buffers between layers are reused to reduce resource consumption. On the software side, the control-interaction and data-exchange logic between the CPU and the CNN accelerator is designed.

Combining the hardware and software designs, recognition of handwritten digits from the MNIST dataset was tested and achieved a very good classification result: an accuracy of 98.5%, only 0.5% lower than the software implementation. Computing performance reached 10.07 GOP/s, which is 11.7 times that of a general-purpose CPU and 1.1 times that of a general-purpose GPU, while power consumption is only 4.3% and 3.1% of the CPU's and GPU's respectively. Compared with OpenCL- and HLS-based FPGA implementations, the design also achieves higher performance per unit of resources.
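The abstract does not reproduce the thesis's HDL, but the output-channel parallelism it describes can be sketched in software. The sketch below is a minimal, illustrative model (not the thesis implementation): the `unroll` parameter stands in for the number of replicated multiply-accumulate units, and the function name and LeNet-5 layer shapes are assumptions for the example.

```python
import numpy as np

def conv_forward(x, w, unroll=4):
    """Direct 2-D convolution (valid padding, stride 1).

    The inner loop handles `unroll` output channels per shared input
    window, mirroring how an FPGA accelerator replicates MAC units to
    exploit output-channel parallelism while reusing the same input data.
    """
    C_in, H, W = x.shape
    C_out, _, K, _ = w.shape
    H_out, W_out = H - K + 1, W - K + 1
    y = np.zeros((C_out, H_out, W_out))
    for co in range(0, C_out, unroll):           # one group of parallel MAC units
        for i in range(H_out):
            for j in range(W_out):
                patch = x[:, i:i+K, j:j+K]       # input window shared by the group
                for c in range(co, min(co + unroll, C_out)):
                    y[c, i, j] = np.sum(patch * w[c])
    return y

# Shapes like LeNet-5's first layer on a 28x28 MNIST image:
# 1 input channel, 6 output channels, 5x5 kernels -> 24x24 output maps.
x = np.random.rand(1, 28, 28)
w = np.random.rand(6, 1, 5, 5)
y = conv_forward(x, w)
print(y.shape)  # (6, 24, 24)
```

In hardware, the group of `unroll` channels would be computed in the same clock cycle by parallel MAC units fed from one shared line buffer; here the grouping only changes the loop order, but it makes the data reuse pattern explicit.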
Keywords/Search Tags: CNN, FPGA, Hardware acceleration