Research On The Deployment And Acceleration Of Recurrent Neural Network On Embedded Devices Based On TVM

Posted on: 2019-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Bao
Full Text: PDF
GTID: 2428330548479816
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the continuous development of machine learning, deep learning has achieved a series of unprecedented successes, and deep learning algorithms have been applied across a wide range of fields. As networks have grown dramatically in size, however, the demand for computing power has become enormous. Most deep learning systems, such as TensorFlow, MXNet, and Caffe, provide optimization only for server-class GPUs, which makes efficient deployment on resource-constrained devices difficult. Related research has therefore been carried out in the field of embedded computing. Recurrent neural networks, an important part of deep learning, are more computationally intensive than convolutional neural networks and are therefore harder to deploy efficiently. This thesis summarizes the state of related research and proposes a scheme that optimizes the computation graph of recurrent neural networks to accelerate execution, based on the open-source frameworks TVM and NNVM. The main work of this thesis is as follows:
1) Recurrent neural networks are introduced and the working principles of NNVM and TVM are explained, with a focus on the calculation flow of three categories of recurrent neural network (RNN, LSTM, GRU).
2) Within the NNVM and TVM framework, an optimization scheme for recurrent neural networks is designed, comprising graph optimization under NNVM and operator symbol optimization under TVM.
3) Comparison experiments conducted on a PC show that, under the optimization scheme, recurrent neural network computation is 80% faster than with the MXNet framework.
4) We implement this neural network on specific embedded devices, a Raspberry Pi and a smartphone, for example...
Keywords/Search Tags:Deep learning, Recurrent Neural Networks, Embedded devices, Acceleration
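The abstract refers to the calculation flow of the RNN, LSTM, and GRU cells without giving it. As a rough illustration only, the sketch below writes out one time step of each cell using the standard textbook equations in plain Python. It is not the thesis's NNVM/TVM implementation; the function names, the parameter dictionary layout (`W*` for input weights, `U*` for recurrent weights, `b*` for biases), and the GRU update convention `h' = (1 - z) * n + z * h` are all assumptions made for this example.

```python
import math

def matvec(W, v):
    # Matrix-vector product; W is a list of rows.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def add(*vecs):
    # Element-wise sum of any number of equal-length vectors.
    return [sum(vals) for vals in zip(*vecs)]

def mul(a, b):
    # Element-wise (Hadamard) product.
    return [x * y for x, y in zip(a, b)]

def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-x)) for x in v]

def tanh(v):
    return [math.tanh(x) for x in v]

def rnn_step(x, h, p):
    # Vanilla RNN: h' = tanh(W x + U h + b)
    return tanh(add(matvec(p["W"], x), matvec(p["U"], h), p["b"]))

def lstm_step(x, h, c, p):
    # Each gate has the same affine form and differs only in its
    # parameters and activation function.
    def gate(k, act):
        return act(add(matvec(p["W" + k], x), matvec(p["U" + k], h), p["b" + k]))
    i = gate("i", sigmoid)   # input gate
    f = gate("f", sigmoid)   # forget gate
    o = gate("o", sigmoid)   # output gate
    g = gate("g", tanh)      # candidate cell state
    c_new = add(mul(f, c), mul(i, g))
    h_new = mul(o, tanh(c_new))
    return h_new, c_new

def gru_step(x, h, p):
    z = sigmoid(add(matvec(p["Wz"], x), matvec(p["Uz"], h), p["bz"]))  # update gate
    r = sigmoid(add(matvec(p["Wr"], x), matvec(p["Ur"], h), p["br"]))  # reset gate
    # Candidate state uses the reset-gated previous hidden state.
    n = tanh(add(matvec(p["Wn"], x), matvec(p["Un"], mul(r, h)), p["bn"]))
    # Assumed convention: h' = (1 - z) * n + z * h
    return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]
```

Every gate in these cells is a pair of matrix-vector products followed by an element-wise activation, so an LSTM or GRU step repeats the same kind of computation several times on the same inputs `x` and `h`. That repetition is the sort of structure a graph-level or operator-level optimizer can work with, although the specific transformations used in the thesis are not described in this abstract.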