
Design Of Convolutional Neural Network Acceleration System And FPGA Verification

Posted on: 2021-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Shen
GTID: 2518306557490244
Subject: IC Engineering
Abstract/Summary:
The convolutional neural network (CNN) is an important model in the field of deep learning. Because it is highly robust to image translation, scaling, rotation and other forms of deformation, the CNN has unique advantages in solving high-level abstract cognitive problems and is widely used in many fields. However, the computationally intensive nature of CNNs makes them difficult to deploy directly on mobile terminal devices. Two challenges remain: how to use limited resources efficiently to obtain higher acceleration performance, and how to design more efficient hardware computing architectures.

Based on a study of CNN structure, this thesis improves the traditional convolution computing method at the algorithm level and uses a new convolution kernel decomposition method to compress the network model, which reduces the number of network parameters. At the hardware level, the circuit design of the CNN computing module and the collaborative processing module is completed, and a CNN hardware acceleration system is constructed. In the CNN computing module, a high-efficiency parallel pipeline computing architecture is designed around the channel independence and weight sharing of CNNs. It lets the convolutional layer and the fully connected layer share one data path, reducing the computational complexity of the circuit. The collaborative processing module mainly comprises register configuration, data movement, ping-pong cache, and flow control. The data movement and ping-pong cache modules reduce the acceleration system's demand for external memory bandwidth. The flow control module avoids the performance degradation caused by frequent intervention of an external processor during system operation, and further improves the computing efficiency of the acceleration system.

The system has been verified on an FPGA development board with the Zynq XC7Z020 as its core chip. The test results indicate that, at a working frequency of 100 MHz, the time required to recognize a 64×64 gesture digit image is about 0.38 ms, a single multiplier provides 0.16 GOPS of performance, and the performance-to-power ratio reaches 11.2 GOPS/W, which meets the design specifications.
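
The reported test figures can be cross-checked with some simple arithmetic. The following Python sketch is not part of the thesis; it only derives quantities implied by the numbers quoted above (100 MHz, about 0.38 ms per image, 0.16 GOPS per multiplier). The helper name `derived_metrics` and the dictionary keys are hypothetical, chosen for illustration. The abstract does not state the number of multipliers or the total power, so neither is computed here.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 100e6            # 100 MHz working frequency
LATENCY_S = 0.38e-3         # about 0.38 ms per 64x64 image
GOPS_PER_MULTIPLIER = 0.16  # throughput attributed to a single multiplier


def derived_metrics(clock_hz, latency_s, gops_per_multiplier):
    """Return quantities implied by the reported numbers (hypothetical helper)."""
    return {
        # Clock cycles spent on one image: frequency times latency.
        "cycles_per_image": clock_hz * latency_s,
        # Images per second if images are processed back to back.
        "images_per_second": 1.0 / latency_s,
        # Operations each multiplier contributes per clock cycle.
        "ops_per_multiplier_per_cycle": gops_per_multiplier * 1e9 / clock_hz,
    }


if __name__ == "__main__":
    metrics = derived_metrics(CLOCK_HZ, LATENCY_S, GOPS_PER_MULTIPLIER)
    for name, value in metrics.items():
        print(f"{name}: {value:.1f}")
```

Under these numbers, one image takes roughly 38,000 clock cycles, which corresponds to about 2,600 images per second, and each multiplier delivers 1.6 operations per cycle. If one multiply-accumulate is counted as two operations (a common convention, assumed here and not stated in the abstract), that would correspond to roughly 80% multiplier utilization.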
Keywords/Search Tags:Convolutional Neural Network, Hardware Acceleration, Parallel Pipeline, FPGA