
Design Of Hardware Accelerator Based On FPGA For Convolutional Neural Networks

Posted on: 2020-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: B J Li
Full Text: PDF
GTID: 2518306518470164
Subject: IC Engineering
Abstract/Summary:
With the rapid development of artificial intelligence, the Convolutional Neural Network (CNN) plays an increasingly important role in image processing and object detection. However, CPUs and GPUs have many drawbacks when handling such large volumes of data: they suffer from slow speed, high cost, and high power consumption, and cannot meet the low-power, low-latency requirements of CNN inference. The Field-Programmable Gate Array (FPGA) can readily address these problems. FPGAs offer low development cost, high flexibility, parallel computation, and low power consumption, which fit the computational demands of CNN algorithms well.

This thesis analyzes and studies existing CNN models, examining the internal computation principles of the convolutional layer, pooling layer, fully connected layer, batch normalization, and activation function, and investigates the current mainstream CNN models in detail. Based on this research, an FPGA-based convolutional neural network accelerator is designed. The accelerator parallelizes the convolution operation along four dimensions. A parameterized architecture is proposed: under its three parameter settings, a single clock cycle completes 512, 1024, or 2048 multiply-accumulate operations, respectively. An on-chip double-buffer structure is designed to reduce off-chip memory accesses while achieving efficient data reuse. The complete single-layer operation of a convolutional neural network is realized as a pipeline, which improves the accelerator's operating efficiency.

In the comparison experiments, a Virtex-7 2000T FPGA is used to accelerate CNN inference, and the results are compared with CPU, GPU, and related FPGA acceleration schemes. The networks used in the comparison are VGG16 and Face Alignment. The experimental results show that for VGG16 inference, the accelerator proposed in this thesis achieves a computation speed of 560.2 GOP/s under the largest parameter setting, which is 8.9 times that of an i7-6850K CPU; at the same time, its performance-to-power ratio is 3.0 times that of an NVIDIA GTX 1080 Ti GPU. For Face Alignment inference, the accelerator reaches 306.8 GOP/s under the largest parameter setting. Compared with related work, the accelerator designed in this thesis achieves high performance and a high performance-to-power ratio while maintaining generality.
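The idea of spatially unrolling the convolution loops so that many multiply-accumulates complete per clock cycle can be sketched in software. The sketch below is illustrative only: the abstract states that 512, 1024, or 2048 MACs finish per cycle under three parameter settings, but it does not name the unrolled dimensions, so the choice of tiling over output channels and input channels (tile sizes `Pco`, `Pci`) is an assumption made here for demonstration.

```python
import numpy as np

def conv2d_reference(x, w):
    """Direct convolution, stride 1, no padding.
    x: (Ci, H, W) input feature map; w: (Co, Ci, K, K) weights."""
    Co, Ci, K, _ = w.shape
    _, H, W = x.shape
    Ho, Wo = H - K + 1, W - K + 1
    y = np.zeros((Co, Ho, Wo))
    for co in range(Co):
        for ci in range(Ci):
            for kh in range(K):
                for kw in range(K):
                    y[co] += w[co, ci, kh, kw] * x[ci, kh:kh + Ho, kw:kw + Wo]
    return y

def conv2d_tiled(x, w, Pco=4, Pci=4):
    """Same convolution, but the two channel loops are processed in
    Pco x Pci tiles, emulating a spatially unrolled MAC array on an FPGA:
    each iteration of the inner body stands for one 'cycle' in which
    Pco * Pci * K * K multiply-accumulates happen in parallel.
    (Pco, Pci are illustrative assumptions, not the thesis's parameters.)"""
    Co, Ci, K, _ = w.shape
    _, H, W = x.shape
    Ho, Wo = H - K + 1, W - K + 1
    y = np.zeros((Co, Ho, Wo))
    for co0 in range(0, Co, Pco):        # tile over output channels
        for ci0 in range(0, Ci, Pci):    # tile over input channels
            for ho in range(Ho):
                for wo in range(Wo):
                    # one emulated cycle of the Pco x Pci x K x K MAC array
                    patch = x[ci0:ci0 + Pci, ho:ho + K, wo:wo + K]
                    wt = w[co0:co0 + Pco, ci0:ci0 + Pci]
                    y[co0:co0 + Pco, ho, wo] += np.einsum('cikl,ikl->c', wt, patch)
    return y
```

Under this kind of scheme, a setting such as 16 output channels x 32 input channels unrolled in parallel would yield 512 MACs per cycle, matching the smallest of the three parameter settings; on the FPGA the tiling also determines how much data the on-chip double buffers must hold per tile.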
Keywords/Search Tags: Convolutional Neural Network, FPGA, Hardware Accelerator, Parameterized Architecture, Pipeline