
Research And Optimization Of Neural Network Acceleration Algorithm

Posted on: 2021-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: S W Niu
Full Text: PDF
GTID: 2518306476950309
Subject: Signal and Information Processing
Abstract/Summary:
In recent years, research on neural networks has intensified, and the proposed models have become increasingly mature. However, as research deepens, the scenarios that neural networks must handle grow more complex, and the models themselves become more complicated. Complex models make the prediction process increasingly time-consuming. How to accelerate the prediction process of a neural network while keeping the original network accuracy essentially unchanged is the key problem of this study. This paper investigates efficient pruning, low-rank decomposition, and hardware acceleration algorithms in the field of neural network acceleration, and verifies these acceleration algorithms on four classic neural networks: LeNet5, AlexNet, VGG11, and VGG16. On the basis of this verification, the acceleration methods are further optimized to improve their acceleration effect and to further reduce the time consumed by the prediction process. The main work of this thesis is as follows:

1. This paper first studies an efficient pruning algorithm based on a first-order Taylor expansion pruning criterion. Experimental results in multiple task scenarios show that the pruning algorithm can effectively remove redundant parameters from the original network and improve its computational efficiency. Two improvements to the acceleration algorithm are proposed: (1) based on the pruning principle, the optimization algorithm used when retraining the pruned model to restore accuracy is changed from stochastic gradient descent to a momentum-based optimizer, which speeds up model recovery; (2) the original pruning criterion is adjusted on the basis of efficient pruning. Compared with the original algorithm, at a pruning ratio of 30% the network accuracy retained under the new pruning criterion is 10% higher than under the original criterion, and the prediction time is further reduced.

2. This paper also studies two low-rank decomposition acceleration algorithms: Canonical Polyadic (CP) decomposition and Tucker decomposition. Experimental results in multiple task scenarios show that both algorithms improve computational efficiency and reduce prediction time while retaining the effective information in the weight tensors. On this basis, a new fusion acceleration algorithm combining Tucker decomposition with the optimized efficient pruning algorithm is proposed. Experiments in multiple task scenarios show that the fusion algorithm reduces model prediction time to a greater extent than either the Tucker decomposition algorithm or the pruning algorithm alone, while better preserving the accuracy of the original model.

3. This paper studies hardware acceleration of neural networks. Image recognition experiments on the Zedboard development board show that, compared with running the network on the ARM processor, accelerating the computation of the convolutional layers of a convolutional neural network on the FPGA effectively improves the network's operating efficiency and greatly reduces prediction time. On this basis, fixed-point quantization of the parameters is proposed; compared with the original hardware acceleration method, it saves hardware resources and further improves computational efficiency.
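The first-order Taylor pruning criterion in contribution 1 can be illustrated with a minimal sketch: the loss change caused by removing a channel is approximated by the product of its activation and the gradient of the loss with respect to that activation. The code below is an illustrative NumPy implementation, not the thesis's actual code; the function names and the per-layer L2 normalization of scores are assumptions.

```python
import numpy as np

def taylor_importance(activations, gradients):
    """First-order Taylor pruning criterion: the loss change from
    removing a channel is approximated by |activation * gradient|,
    averaged over the batch and spatial dimensions.
    activations, gradients: arrays of shape (batch, channels, h, w)."""
    scores = np.abs(activations * gradients).mean(axis=(0, 2, 3))
    # Normalize per layer so scores are comparable across layers
    # (an assumed, commonly used variant of the criterion).
    return scores / (np.linalg.norm(scores) + 1e-8)

def prune_mask(scores, prune_ratio=0.3):
    """Return a boolean keep-mask that drops the prune_ratio fraction
    of channels with the lowest importance scores."""
    k = int(len(scores) * prune_ratio)
    drop = np.argsort(scores)[:k]  # lowest-importance channels
    mask = np.ones(len(scores), dtype=bool)
    mask[drop] = False
    return mask
```

In a full pipeline, the mask would be used to delete the corresponding filters from a convolutional layer, after which the network is fine-tuned (per the thesis, with a momentum optimizer rather than plain SGD) to recover accuracy.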
Keywords/Search Tags:neural networks, efficient pruning, canonical polyadic decomposition, tucker decomposition, hardware acceleration