Simulation Implementation Of Deep Learning Software And Hardware Co-design Based On FPGA

Posted on: 2022-12-16  Degree: Master  Type: Thesis
Country: China  Candidate: W Zhou
GTID: 2518306764471524  Subject: Automation Technology
Abstract/Summary:
The complexity of deep learning networks keeps increasing as research progresses. As an important branch of deep learning, convolutional neural networks also demand ever more data and computation, and various acceleration platforms have emerged. To enable convolutional neural networks to be applied in low-power, highly parallel embedded scenarios, this thesis studies and implements an FPGA-based software-hardware co-design method for deep learning and uses it to implement the YOLOV3-Tiny algorithm. After studying the YOLOV3-Tiny network structure, the BN (batch normalization) layers are merged into the convolution kernel parameters, and the parameters are quantized to 16-bit fixed point. Tasks are partitioned manually between software and hardware according to the strengths of each: the software side handles system control and tasks suited to serial processing on the CPU, while the hardware side handles compute-intensive, highly parallel tasks. For the PL-side tasks, the Vivado HLS high-level synthesis tool is used to design and package the IP core of the PL-side acceleration module; the storage layout of the feature maps and convolution kernel parameters is optimized, and multichannel transmission and ping-pong buffering are used to further reduce system latency. For the PS-side tasks, the corresponding software drivers are written in the Vivado SDK environment, implementing the image input, data control, and detection output modules.
Finally, a software-hardware collaborative object detection system is built on the ZYNQ7035. On the COCO test set, the system reaches 27.52 GOPS, which is 33 times the performance of the CPU (Intel i5-9300H), 0.16 times that of the GPU (GTX 1660 Ti), and 707 times that of the ARM core (Cortex A9). The performance-to-power ratio reaches 12.8 GOPS/W, which is 58 times that of the CPU (Intel i5-9300H), 984 times that of the ARM core (Cortex A9), and 25 times that of the GPU (GTX 1660 Ti), achieving the goal of low power consumption and high performance.
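The BN merging and 16-bit fixed-point quantization mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the function names, the per-channel list layout, the epsilon value, and the choice of 8 fractional bits are all assumptions made for illustration, since the abstract does not state the fixed-point format used.

```python
import math


def fold_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-output-channel BN parameters into conv weights and bias.

    weights: one flat list of kernel weights per output channel (assumed layout).
    BN computes y = gamma * (x - mean) / sqrt(var + eps) + beta, so the folded
    layer uses w' = w * scale and b' = (b - mean) * scale + beta,
    with scale = gamma / sqrt(var + eps).
    """
    folded_w, folded_b = [], []
    for w_c, b_c, g, bt, mu, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)
        folded_w.append([w * scale for w in w_c])
        folded_b.append((b_c - mu) * scale + bt)
    return folded_w, folded_b


def quantize_q(x, frac_bits=8, total_bits=16):
    """Convert a float to signed fixed point with saturation.

    frac_bits=8 is an assumed format, not one taken from the thesis.
    Python's round() rounds halves to the nearest even integer.
    """
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = int(round(x * (1 << frac_bits)))
    return max(lo, min(hi, q))


def dequantize_q(q, frac_bits=8):
    """Map a fixed-point integer back to its float value."""
    return q / (1 << frac_bits)
```

After folding, the BN layer no longer needs to be evaluated at inference time, so the hardware only has to implement the convolution with the folded weights and bias. Values outside the representable range saturate rather than wrap, which is a common choice for fixed-point accelerators but is again an assumption here.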
Keywords/Search Tags:deep learning, convolutional neural network, software-hardware co-design