
Research On Neural Network Compilation And Acceleration Technology Based On FPGA

Posted on: 2021-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: J N Jiang
Full Text: PDF
GTID: 2518306503974619
Subject: IC Engineering
Abstract/Summary:
Artificial intelligence technology plays a vital role in many application fields, and more and more computing platforms attach importance to the deployment and optimization of neural network algorithms. Neural network algorithms are characterized by high computational complexity and rapid evolution, so how to efficiently and quickly deploy them to devices ranging from the cloud to the edge, and to accelerate them in hardware, is currently a research hotspot in both industry and academia. This paper studies the compilation, deployment, and accelerated optimization of neural network algorithms on heterogeneous acceleration platforms.

Firstly, a heterogeneous hardware acceleration platform based on an Intel Arria 10 FPGA and a Xeon processor is built, together with an open-source neural network compilation stack based on a model optimizer and an inference engine. Secondly, a general neural network accelerator based on OpenCL is designed; the influence of different numerical precisions on the accuracy and speed of neural networks is analyzed, and the design of the general accelerator is optimized according to the computational characteristics of five common networks. Finally, an optimization method for the acceleration platform is proposed: by tracking dataflow changes between the FPGA and the processor, the hardware/software partitioning of the platform is optimized and the execution hardware of each operation is allocated reasonably, reducing data transmission between the processor and the FPGA; through operator fusion, adjacent computing layers are fused to reduce the overhead of data access and storage.

The experimental results show that the heterogeneous acceleration platform in this paper can deploy different neural network algorithms, and the designed accelerator achieves up to 2.3 TFLOPS across the five kinds of networks. The proposed optimization methods effectively improve neural network inference speed. For the three algorithms ResNet, Inception, and MobileNet, fusing computations in the trained model files reduces the number of layers after conversion by 45% compared with the original number of layers. For ResNet, MobileNet, and Inception, inference time after dataflow optimization is reduced by more than 50%. For ResNet and Inception, the speedup after optimizing the accelerator structure exceeds 5x. For ResNet, Inception, and VGG, using FP11 improves throughput by more than 2x compared with FP16 while the loss of image classification accuracy is no more than 1%.
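The operator fusion described in the abstract can be illustrated with a common special case: folding a batch-normalization layer into the preceding convolution, so the fused layer reads and writes the feature map only once. The sketch below is not taken from the thesis; it is a minimal NumPy illustration of the folding arithmetic, and all names in it (fuse_conv_bn, gamma, beta, and so on) are chosen here for the example.

```python
import numpy as np

def fuse_conv_bn(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold a batch-norm layer into the preceding convolution.

    W:     conv weights, shape (out_ch, in_ch, kh, kw)
    b:     conv bias, shape (out_ch,)
    gamma, beta, mean, var: per-channel BN parameters, shape (out_ch,)

    Returns (W', b') such that BN(conv(x, W, b)) == conv(x, W', b'),
    so the two layers execute as one and the intermediate feature map
    is never written back to memory.
    """
    scale = gamma / np.sqrt(var + eps)           # per-output-channel scale
    W_fused = W * scale[:, None, None, None]     # scale each output filter
    b_fused = (b - mean) * scale + beta          # fold mean shift into bias
    return W_fused, b_fused

# Quick numerical check on a 1x1 convolution (just a matmul per pixel):
rng = np.random.default_rng(0)
out_ch, in_ch, eps = 4, 3, 1e-5
W = rng.normal(size=(out_ch, in_ch, 1, 1))
b = rng.normal(size=out_ch)
gamma, beta = rng.normal(size=out_ch), rng.normal(size=out_ch)
mean, var = rng.normal(size=out_ch), rng.random(out_ch) + 0.1
x = rng.normal(size=in_ch)

conv = lambda x, W, b: W[:, :, 0, 0] @ x + b
y_ref = gamma * (conv(x, W, b) - mean) / np.sqrt(var + eps) + beta
Wf, bf = fuse_conv_bn(W, b, gamma, beta, mean, var, eps)
assert np.allclose(y_ref, conv(x, Wf, bf))
```

Because the fused weights are computed once offline from the trained model file, the transformation costs nothing at inference time; the saving comes entirely from eliminating the extra pass over the feature map.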
Keywords/Search Tags: Artificial Intelligence, Heterogeneous acceleration, General accelerator, Operator fusion, Dataflow optimization