
Research on FPGA-Based RTL-Level Convolutional Neural Network Computing System

Posted on: 2022-09-13    Degree: Master    Type: Thesis
Country: China    Candidate: H Y Li    Full Text: PDF
GTID: 2518306527469954    Subject: Electronic Science and Technology
Abstract/Summary:
In recent years, with the continuous development of deep learning technology, convolutional neural networks have shown great advantages in many application fields such as image recognition, scene classification, and speech recognition. However, when solving more abstract and complex problems, the networks themselves become more complex, with larger scale and higher computing power requirements. How to accelerate convolutional neural network computation so that the network can be deployed on mobile terminals with lower latency and higher system performance has become an urgent problem. As a programmable logic device, the FPGA is well suited to deployment on edge-computing mobile terminals thanks to its low power consumption, small size, and strong performance. At the same time, its hardware implementation of algorithms and its parallel computing characteristics match the operation mode of convolutional neural networks well, making it an ideal platform for deploying convolutional neural networks on mobile terminals.

This thesis first introduces the basic principles of convolutional neural networks and analyzes three typical networks. It then studies FPGA calculation and optimization methods for convolutional neural networks in depth, and explores an effective calculation mode for deploying convolutional neural networks on an FPGA hardware platform. The specific work includes:

1. A convolution kernel model with reconfigurable weights is proposed. The model uses a weight register with a data interface: before each convolution calculation starts, weight data stored externally is imported into the convolution kernel, and the kernel is reset when the calculation is completed. This allows the convolution kernel to be reused and minimizes the number of kernels that must be fixed in hardware, reducing the demand that a large-scale convolutional neural network places on internal FPGA resources (see the behavioral sketch after this list).

2. A new serial data-flow model optimizes the control of data flow in the convolution calculation, reducing the proportion of clock cycles spent reading memory and increasing computational efficiency. Combined with dynamic data caching, a convolution calculation can be completed within one clock cycle, which reduces the memory capacity required for caching intermediate data.

3. The convolutional layer and the pooling layer are abstracted into a two-stage pipeline, so that pooling is performed at the same time as convolution. This breaks the serial relationship between layers in the computation of the convolutional neural network, increases the system's data-stream processing speed, and further improves its computing performance (see the pipeline sketch after this list).

4. Taking the LeNet-5 network as an example, a convolutional neural network computing system for handwriting recognition is built on the ZYNQ platform. Each module of the network is programmed and simulated, and the network is quantized to 16-bit data (see the quantization sketch after this list). On the MNIST handwritten digit data set the system achieves good classification results: the accuracy is 97.6%, and at a 100 MHz system clock the calculation speed is 1.15 times that of the GPU while the overall power consumption is 2.3% of the GPU's. The system satisfies the performance and power consumption requirements for deploying convolutional neural networks on mobile terminals and has positive significance for the practical use of convolutional neural network systems.
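The following is a minimal behavioral sketch in Python (an illustration, not the thesis's Verilog RTL) of the reconfigurable-weight convolution kernel described in item 1: one physical kernel holds its weights in a loadable register, imports a new weight set from external storage before each pass, and is reset afterward so the same kernel can be reused. The 3x3 kernel size, integer data type, and class/function names are assumptions.

    # Behavioral sketch of a weight-reconfigurable convolution kernel (not RTL).
    import numpy as np

    class ReconfigurableKernel:
        """One physical 3x3 multiply-accumulate array with a loadable weight register."""

        def __init__(self, ksize=3):
            self.ksize = ksize
            self.weights = np.zeros((ksize, ksize), dtype=np.int32)  # weight register
            self.loaded = False

        def load_weights(self, w):
            """Import weight data from external storage before a convolution pass starts."""
            assert w.shape == (self.ksize, self.ksize)
            self.weights = w.astype(np.int32)
            self.loaded = True

        def reset(self):
            """Clear the weight register once the pass finishes so the kernel can be reused."""
            self.weights[:] = 0
            self.loaded = False

        def convolve(self, window):
            """One multiply-accumulate over a ksize x ksize input window."""
            assert self.loaded, "weights must be loaded before computing"
            return int(np.sum(window.astype(np.int32) * self.weights))

    if __name__ == "__main__":
        kernel = ReconfigurableKernel()
        feature = np.random.randint(-8, 8, size=(6, 6))
        # Reuse the same physical kernel with two different weight sets.
        for w in (np.ones((3, 3), dtype=np.int32), np.eye(3, dtype=np.int32)):
            kernel.load_weights(w)
            out = [[kernel.convolve(feature[r:r + 3, c:c + 3])
                    for c in range(4)] for r in range(4)]
            kernel.reset()
            print(np.array(out))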
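Below is a behavioral sketch, again in Python rather than RTL, of the two-stage convolution/pooling pipeline from item 3: the pooling stage consumes convolution output rows as soon as they are produced instead of waiting for the whole feature map, which models the layer-overlapping behavior described above. The 2x2 max pooling, feature-map size, and function names are assumptions.

    # Behavioral sketch of conv and pooling overlapping as a two-stage pipeline.
    import numpy as np

    def conv_rows(image, weights):
        """Stage 1: stream convolution output one row at a time (stride 1, valid padding)."""
        k = weights.shape[0]
        out_w = image.shape[1] - k + 1
        for r in range(image.shape[0] - k + 1):
            row = [int(np.sum(image[r:r + k, c:c + k] * weights)) for c in range(out_w)]
            yield np.array(row)

    def pool_stream(rows, pool=2):
        """Stage 2: 2x2 max pooling over conv rows as they arrive (one row buffered)."""
        buffer = None
        for row in rows:
            if buffer is None:
                buffer = row            # hold one conv row while the next is being computed
            else:
                pair = np.stack([buffer, row])
                yield pair.reshape(2, -1, pool).max(axis=(0, 2))
                buffer = None

    if __name__ == "__main__":
        img = np.random.randint(0, 16, size=(6, 6))
        w = np.ones((3, 3), dtype=np.int32)
        for pooled_row in pool_stream(conv_rows(img, w)):
            print(pooled_row)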
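Finally, a minimal sketch of 16-bit fixed-point quantization of the kind mentioned in item 4. The thesis only states that 16-bit data is used to quantize the LeNet-5 network; the Q8.8 split between integer and fractional bits, the saturation behavior, and the function names here are assumptions chosen for illustration.

    # Sketch of quantizing float weights/activations to signed 16-bit fixed point (assumed Q8.8).
    import numpy as np

    FRAC_BITS = 8          # assumed fractional bits; the thesis only specifies 16-bit data
    SCALE = 1 << FRAC_BITS

    def to_fixed16(x):
        """Quantize float values to signed 16-bit fixed point with saturation."""
        q = np.round(np.asarray(x, dtype=np.float64) * SCALE)
        return np.clip(q, -32768, 32767).astype(np.int16)

    def from_fixed16(q):
        """Recover an approximate float value from the 16-bit representation."""
        return q.astype(np.float64) / SCALE

    if __name__ == "__main__":
        w = np.array([0.731, -1.294, 0.005, 2.0])
        qw = to_fixed16(w)
        print(qw, from_fixed16(qw))   # rounding error is at most 1/2^9 per unsaturated value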
Keywords/Search Tags: Convolutional Neural Network, FPGA Hardware Acceleration, RTL-Level Calculation Optimization