
Optimizing Deep Learning Computation On Mobile Devices

Posted on: 2020-09-27
Degree: Master
Type: Thesis
Country: China
Candidate: T L Zhao
Full Text: PDF
GTID: 2428330575488975
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Deep learning algorithms based on Convolutional Neural Networks (CNNs) have recently attracted growing attention in both industry and academia for their superior performance in tasks such as image classification, object detection, and natural language processing. As new tasks and requirements continue to emerge, their application scenarios keep broadening. However, to handle increasingly complex tasks and meet ever higher accuracy requirements, most modern deep learning models are becoming wider and deeper, so the memory footprint and computational complexity of inference keep increasing. At the same time, demand for running deep learning algorithms on low-end mobile devices continues to grow. On these devices, memory, power, and computational resources are severely limited, which makes deployment difficult and has become the main bottleneck limiting the application scenarios of deep learning algorithms. To address this problem, we redesign the memory management strategy and computation flow of Binary Neural Networks; the resulting method, BitStream, runs several times faster than existing methods. We also propose and implement an algorithm for efficient inference of quantized neural networks, which is several times faster than floating-point networks. To make these algorithms more practical, we design and implement QEngine, a lightweight framework for efficient inference of deep learning models on mobile devices, which is several times faster than other frameworks.
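For readers unfamiliar with why Binary Neural Networks can be computed cheaply, the following is a minimal sketch of the widely used XNOR-plus-popcount dot product over bit-packed {-1, +1} vectors. It illustrates the general technique only and is not the BitStream implementation described in the abstract; the function names `pack_bits` and `binary_dot` are hypothetical and chosen for illustration.

```python
def pack_bits(values):
    """Pack a list of +1/-1 values into one integer.

    Bit i of the result is 1 when values[i] == +1 and 0 when it is -1.
    (Hypothetical helper, not taken from the thesis.)
    """
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two length-n {-1, +1} vectors given their packed bits.

    XNOR marks the positions where the two vectors agree. Each agreement
    contributes +1 and each disagreement contributes -1, so the result is
    matches - (n - matches) = 2 * matches - n.
    """
    mask = (1 << n) - 1                  # keep only the n valid bits
    xnor = ~(a_bits ^ b_bits) & mask     # 1 where the bits are equal
    matches = bin(xnor).count("1")       # population count
    return 2 * matches - n


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]

# Reference result using ordinary multiply-accumulate.
reference = sum(x * y for x, y in zip(a, b))
result = binary_dot(pack_bits(a), pack_bits(b), len(a))
```

On real hardware the same idea is applied to whole machine words, so a single XOR and a popcount instruction replace many multiply-accumulate operations; how a given engine lays out and streams those words in memory is a separate design choice.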
Keywords/Search Tags: deep learning, convolutional neural networks, quantized neural networks, binary neural networks