
Implementation And Verification Of Convolutional Neural Network Based On FPGA

Posted on: 2022-05-29
Degree: Master
Type: Thesis
Country: China
Candidate: X W Xiao
Full Text: PDF
GTID: 2518306485956699
Subject: Electronics and Communications Engineering
Abstract/Summary:
Convolutional neural networks (CNNs) involve a huge amount of computation, making it difficult to meet real-time and low-power requirements on general-purpose processors (CPUs). Graphics processing units (GPUs) are currently the mainstream platform for CNN computation, but they still suffer from high power consumption and insufficient real-time performance. FPGAs are programmable logic devices that offer low computational latency, high parallelism, and low power consumption. Although using FPGAs to accelerate CNNs has received increasing attention, the field is still in an early stage of development: deploying CNNs on FPGAs faces difficulties in hardware debugging and development, and limited hardware resources make it hard to support large-scale CNN algorithms. This thesis therefore starts from the deployment of small-scale CNNs and conducts exploratory research on FPGA implementations of CNNs.

The thesis first analyzes the principles, module characteristics, and computational complexity of convolutional neural networks. Because convolutional-layer computations are frequently invoked, computationally complex, and time-consuming, a general convolutional-layer accelerator is designed. Network weight quantization is studied through an analysis of classification accuracy, yielding a general FPGA-based CNN acceleration unit adapted to mainstream network structures. Based on this unit, an improved LeNet5 network is implemented in simulation, showing that the accelerated computation greatly reduces the amount of network computation while almost preserving classification accuracy. The accelerator uses only the PL (programmable logic) side of the FPGA, i.e., it is a purely logical implementation. Combining the available FPGA hardware resources, parallel computation of the improved LeNet5 network is researched and analyzed, and the entire network is finally implemented on the FPGA with a parallelism of 32.

The proposed approach is compared with CPU and GPU implementations of the CNN and with results reported in other literature, and it shows advantages in each comparison. The designed accelerator achieves 97.99% recognition accuracy, less than 1% below existing implementation methods. In computing performance, the accelerator reaches 7.75 GOP/s, which is 32.3 times that of a general-purpose CPU, while consuming only 0.65% of the CPU's energy. In computation speed, the accelerator needs only 38 us to process one image, 37.7 times and 3.6 times faster than the CPU and GPU, respectively. In design frequency, power consumption, and energy-efficiency ratio, it also compares favorably with the existing literature.
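The convolutional-layer computation that the accelerator targets can be sketched in software as a sliding-window multiply-accumulate loop. This is a minimal illustrative model, not the thesis's hardware design: the function name and the use of a single 2-D channel are assumptions for clarity. The point is that every output pixel's accumulation is independent, which is exactly what a hardware accelerator exploits when it computes many windows in parallel (e.g., with a parallelism of 32, as in the thesis).

```python
def conv2d(image, kernel):
    """Valid (no-padding) 2-D convolution via explicit sliding windows.

    Each output element is an independent multiply-accumulate over one
    kernel-sized window, so hardware can compute many of them in parallel.
    """
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r in range(oh):            # every output row ...
        for c in range(ow):        # ... and column is independent
            acc = 0
            for i in range(kh):    # multiply-accumulate over the window
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            out[r][c] = acc
    return out

# Small worked example: a 3x3 image with a 2x2 all-ones kernel sums
# each 2x2 window, giving a 2x2 output of window sums.
result = conv2d([[1, 1, 1], [1, 1, 1], [1, 1, 1]], [[1, 1], [1, 1]])
```

In hardware, the two outer loops are unrolled into parallel processing elements and the inner multiply-accumulates map to DSP slices, which is the source of the throughput gain over a sequential CPU implementation.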
Keywords/Search Tags:Convolutional Neural Network, Hardware Acceleration, FPGA, Improved LeNet5