
Acceleration System Design And Implementation For Convolutional Neural Network Based On SOC FPGA

Posted on: 2022-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: J X Shi
Full Text: PDF
GTID: 2518306314969579
Subject: IC Engineering
Abstract/Summary:
In recent years, with the rapid development of artificial intelligence, deep learning algorithms have also advanced rapidly. As a representative deep learning algorithm, the convolutional neural network has been widely applied in many fields, such as face recognition, object detection, and image and video classification. However, as problem complexity grows, network depth increases sharply and the amount of computation becomes enormous, so traditional CPU and GPU implementations appear inadequate in scenarios that demand high real-time performance and low power consumption, such as drones, smart cameras, and other mobile terminals. An SOC FPGA not only integrates the rich logic resources of an FPGA but also carries an ARM processor; its design flexibility, high speed, low energy consumption, and compact, portable form factor make it well suited to rapid deployment on mobile terminals. Based on the SOC FPGA platform, this thesis selects the VGG16 network model and uses a software-hardware collaborative design method to implement a high-performance, low-power convolutional neural network acceleration system.

Building on a thorough analysis of the basic principles of convolutional neural networks and the structural characteristics of the VGG16 network model, this thesis partitions the functions of the acceleration system according to the principle of software-hardware collaborative design and proposes the overall architecture of the acceleration system. The architecture consists mainly of an HPS software control module, an FPGA hardware acceleration module, a software-hardware data interaction module, and an off-chip storage module. The HPS side is programmed in C: the serial terminal is configured through PuTTY, peripherals are added on the basis of the GHRD design to complete the Qsys system, and the HPS handles input data loading, issuing of hardware control commands, and reading back of FPGA-side computation results. The FPGA side is designed in the Verilog HDL hardware description language: the main operation modules of the VGG16 network, such as convolution, pooling, and fully connected layers, are implemented at RTL and synthesized and compiled in Quartus Prime to complete the hardware circuit design of the algorithm network. Data communication between the HPS and the FPGA is based on the AXI bus and uses three bus interfaces: F2H_AXI_Slave, H2F_AXI_Master, and H2F_LW_AXI_Master. To buffer data across clock domains, a dual-port asynchronous FIFO module is designed. In addition, to ensure sufficient data storage space, a 64 GB SD card is selected as the on-board peripheral.

Each functional module and the complete system are simulated and verified in ModelSim and finally deployed on the DE1-SOC development board. The system completes recognition and classification of a subset of images from the ImageNet dataset, achieving a recognition rate of more than 80% with a peak power consumption of only 4.8 W. For 224×224 input images, the processing rate reaches 218.76 FPS. Compared with a general-purpose CPU platform, the system achieves a 2x speedup in processing rate, and its peak power consumption is only about 1/7 of the CPU's; compared with a general-purpose GPU platform, its performance-to-power ratio (the ratio of processing rate to power consumption) is 6 times that of the GPU platform; compared with similar accelerators in recent years, it also has advantages in terms of processing speed and power consumption.
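For reference, a minimal Verilog sketch of such a dual-port asynchronous FIFO is shown below. It follows the common pattern of Gray-coded read and write pointers synchronized into the opposite clock domain with two flip-flops; the port names, data width, and depth are illustrative assumptions and not the parameters used in the thesis.

```verilog
// Dual-clock (asynchronous) FIFO sketch with Gray-coded pointers.
// Port names, data width, and depth are illustrative assumptions.
module async_fifo #(
    parameter DATA_WIDTH = 32,
    parameter ADDR_WIDTH = 4                // depth = 2**ADDR_WIDTH words
)(
    // Write side (e.g., bus-facing clock domain)
    input  wire                  wr_clk,
    input  wire                  wr_rst_n,
    input  wire                  wr_en,
    input  wire [DATA_WIDTH-1:0] wr_data,
    output wire                  full,
    // Read side (e.g., accelerator clock domain)
    input  wire                  rd_clk,
    input  wire                  rd_rst_n,
    input  wire                  rd_en,
    output reg  [DATA_WIDTH-1:0] rd_data,
    output wire                  empty
);
    // Dual-port storage
    reg [DATA_WIDTH-1:0] mem [0:(1<<ADDR_WIDTH)-1];

    // Binary and Gray-coded pointers, one extra bit to tell full from empty
    reg [ADDR_WIDTH:0] wr_bin, wr_gray, rd_bin, rd_gray;
    // Pointers synchronized into the opposite clock domain (two flip-flops)
    reg [ADDR_WIDTH:0] rd_gray_w1, rd_gray_w2;   // read pointer in wr_clk domain
    reg [ADDR_WIDTH:0] wr_gray_r1, wr_gray_r2;   // write pointer in rd_clk domain

    wire [ADDR_WIDTH:0] wr_bin_next  = wr_bin + (wr_en && !full);
    wire [ADDR_WIDTH:0] wr_gray_next = (wr_bin_next >> 1) ^ wr_bin_next;
    wire [ADDR_WIDTH:0] rd_bin_next  = rd_bin + (rd_en && !empty);
    wire [ADDR_WIDTH:0] rd_gray_next = (rd_bin_next >> 1) ^ rd_bin_next;

    // Write-domain logic: store data and advance the write pointer
    always @(posedge wr_clk or negedge wr_rst_n) begin
        if (!wr_rst_n) begin
            wr_bin  <= 0;
            wr_gray <= 0;
        end else begin
            if (wr_en && !full)
                mem[wr_bin[ADDR_WIDTH-1:0]] <= wr_data;
            wr_bin  <= wr_bin_next;
            wr_gray <= wr_gray_next;
        end
    end

    // Synchronize the read pointer into the write clock domain
    always @(posedge wr_clk or negedge wr_rst_n) begin
        if (!wr_rst_n) {rd_gray_w2, rd_gray_w1} <= 0;
        else           {rd_gray_w2, rd_gray_w1} <= {rd_gray_w1, rd_gray};
    end

    // Read-domain logic: fetch data and advance the read pointer
    always @(posedge rd_clk or negedge rd_rst_n) begin
        if (!rd_rst_n) begin
            rd_bin  <= 0;
            rd_gray <= 0;
            rd_data <= 0;
        end else begin
            if (rd_en && !empty)
                rd_data <= mem[rd_bin[ADDR_WIDTH-1:0]];
            rd_bin  <= rd_bin_next;
            rd_gray <= rd_gray_next;
        end
    end

    // Synchronize the write pointer into the read clock domain
    always @(posedge rd_clk or negedge rd_rst_n) begin
        if (!rd_rst_n) {wr_gray_r2, wr_gray_r1} <= 0;
        else           {wr_gray_r2, wr_gray_r1} <= {wr_gray_r1, wr_gray};
    end

    // Empty: Gray pointers match; full: write pointer is one wrap ahead,
    // i.e., equal except for the two inverted most significant bits.
    assign empty = (rd_gray == wr_gray_r2);
    assign full  = (wr_gray == {~rd_gray_w2[ADDR_WIDTH:ADDR_WIDTH-1],
                                 rd_gray_w2[ADDR_WIDTH-2:0]});
endmodule
```

In a design of this kind, the write side would typically sit in the clock domain of the AXI bridge logic and the read side in the accelerator's computation clock domain; the exact interfaces and parameters used in the thesis are not specified in this abstract.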
Keywords/Search Tags:Convolutional Neural Network, SOC FPGA, Hardware acceleration, Software and hardware collaborative design