
Research On Neural Network Compilation And Acceleration Technology Based On FPGA

Posted on: 2021-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: J N Jiang
Full Text: PDF
GTID: 2518306503974619
Subject: IC Engineering
Abstract/Summary:
Artificial intelligence technology plays a vital role in many application fields, and more and more computing platforms attach importance to the deployment and optimization of neural network algorithms. Neural network algorithms are characterized by high computational complexity and rapid evolution, so how to efficiently and quickly deploy them to devices ranging from the cloud to the edge, and to accelerate them in hardware, is currently a research hotspot in both industry and academia. This paper studies the compilation, deployment, and accelerated optimization of neural network algorithms on heterogeneous acceleration platforms.

Firstly, a heterogeneous hardware acceleration platform based on an Intel Arria 10 FPGA and a Xeon processor is built, together with an open-source neural network compilation stack based on a model optimizer and an inference engine. Secondly, a general neural network accelerator based on OpenCL is designed; the influence of different numerical precisions on the accuracy and speed of neural networks is analyzed, and the design of the general accelerator is optimized according to the computational characteristics of five common networks. Finally, an optimization method for the acceleration platform is proposed: by tracking dataflow changes between the FPGA and the processor, the hardware/software partitioning of the platform is optimized and the execution hardware of each operation is allocated reasonably, reducing data transmission between the processor and the FPGA; through operator fusion, adjacent computing layers are fused to reduce the overhead of data access and storage.

The experimental results show that the heterogeneous acceleration platform in this paper can deploy different neural network algorithms, and the designed accelerator achieves up to 2.3 TFLOPS across the five kinds of networks. The proposed optimization methods effectively improve neural network inference speed. For the three algorithms ResNet, Inception, and MobileNet, fusing computations in the trained model files reduces the number of layers after conversion by 45% compared with the original number of layers. For ResNet, MobileNet, and Inception, inference time after dataflow optimization is reduced by more than 50%. For ResNet and Inception, the speedup after optimizing the accelerator structure exceeds 5x. For ResNet, Inception, and VGG, using FP11 improves throughput by more than 2x compared with FP16 while the loss of image classification accuracy is no more than 1%.
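The operator fusion described in the abstract can be illustrated with a common special case: folding a batch-normalization layer into the preceding convolution, so the fused layer reads and writes the feature map only once. The sketch below is not taken from the thesis; it is a minimal NumPy illustration of the folding arithmetic, and all names in it (fuse_conv_bn, gamma, beta, and so on) are chosen here for the example.

```python
import numpy as np

def fuse_conv_bn(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold a batch-norm layer into the preceding convolution.

    W:     conv weights, shape (out_ch, in_ch, kh, kw)
    b:     conv bias, shape (out_ch,)
    gamma, beta, mean, var: per-channel BN parameters, shape (out_ch,)

    Returns (W', b') such that BN(conv(x, W, b)) == conv(x, W', b'),
    so the two layers execute as one and the intermediate feature map
    is never written back to memory.
    """
    scale = gamma / np.sqrt(var + eps)           # per-output-channel scale
    W_fused = W * scale[:, None, None, None]     # scale each output filter
    b_fused = (b - mean) * scale + beta          # fold mean shift into bias
    return W_fused, b_fused

# Quick numerical check on a 1x1 convolution (just a matmul per pixel):
rng = np.random.default_rng(0)
out_ch, in_ch, eps = 4, 3, 1e-5
W = rng.normal(size=(out_ch, in_ch, 1, 1))
b = rng.normal(size=out_ch)
gamma, beta = rng.normal(size=out_ch), rng.normal(size=out_ch)
mean, var = rng.normal(size=out_ch), rng.random(out_ch) + 0.1
x = rng.normal(size=in_ch)

conv = lambda x, W, b: W[:, :, 0, 0] @ x + b
y_ref = gamma * (conv(x, W, b) - mean) / np.sqrt(var + eps) + beta
Wf, bf = fuse_conv_bn(W, b, gamma, beta, mean, var, eps)
assert np.allclose(y_ref, conv(x, Wf, bf))
```

Because the fused weights are computed once offline from the trained model file, the transformation costs nothing at inference time; the saving comes entirely from eliminating the extra pass over the feature map.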
Keywords/Search Tags: Artificial Intelligence, Heterogeneous acceleration, General accelerator, Operator fusion, Dataflow optimization