
Research And Implementation Of Execution Optimization System For Deep Learning Applications

Posted on: 2021-09-26
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Yang
Full Text: PDF
GTID: 2518306557487414
Subject: Computer technology

Abstract/Summary:
In recent years, artificial intelligence has achieved notable success. A technology called deep learning has been proposed, which makes use of deep neural networks to perform pattern recognition and data analysis. Deep neural networks are loosely modelled on the human brain and perform well on complex tasks. Many applications, such as face detection, machine translation, and speech recognition, are built on deep neural networks.

The life cycle of a deep learning application consists of two phases: training and inference. During training, neural networks learn from data and their weights are updated; this process is compute-intensive and takes a long time to complete. Inference is the production phase, in which trained models are deployed to make predictions on real-world data. The effectiveness of inference is measured by two metrics, accuracy and latency, and it is difficult to achieve high accuracy and low latency simultaneously. To optimize the execution of deep learning applications, two issues must therefore be addressed: how to accelerate model training and how to improve inference effectiveness. This thesis revolves around these two issues, and its main achievements are as follows.

Firstly, this thesis proposes a model-aware parallelization strategy for the distributed training of deep neural networks, which consists of two steps. The first step is model profiling, which estimates the size of the parameters and output data of each layer using the formulas summarized in the third chapter. The second step is strategy making, which uses the model information collected in the previous step to analyze the time overhead and selects the best strategy with particle swarm optimization.

Secondly, this thesis proposes a task scheduling strategy oriented to the heterogeneous requirements of inference tasks, which also consists of two steps. The first step is task offloading, in which each task is dispatched to a server in consideration of server load, server performance, and the task's deadline. The second step is task scheduling, in which each server decides which model to use for each task and determines the task order in consideration of the tasks' time sensitivity and accuracy sensitivity.

Finally, this thesis designs and implements a prototype system on the SEU Cloud platform, putting the theoretical results into practice. Experimental results show that the proposed model-aware parallelization strategy reduces the time overhead of training, and that the task-oriented scheduling strategy improves the success rate and accuracy of inference.
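To make the model-profiling step concrete, the sketch below estimates per-layer parameter and output sizes. The thesis derives its own formulas in the third chapter, which are not reproduced here; the calculations below are the standard size computations for convolutional and fully connected layers, and all function names and arguments are hypothetical.

# Minimal model-profiling sketch. The thesis uses its own Chapter 3
# formulas; the sizes below are the textbook calculations for conv and
# fully connected layers, included only for illustration.

def conv_layer_sizes(k, c_in, c_out, h_out, w_out, bytes_per_elem=4):
    """Return (parameter_bytes, output_bytes) for a convolutional layer.

    Parameters: k*k*c_in*c_out weights plus c_out biases.
    Output activation: h_out * w_out * c_out elements.
    """
    params = (k * k * c_in * c_out + c_out) * bytes_per_elem
    output = h_out * w_out * c_out * bytes_per_elem
    return params, output

def fc_layer_sizes(n_in, n_out, bytes_per_elem=4):
    """Return (parameter_bytes, output_bytes) for a fully connected layer."""
    params = (n_in * n_out + n_out) * bytes_per_elem
    output = n_out * bytes_per_elem
    return params, output

# Example: first conv layer of a VGG-style network on a 224x224 RGB image.
p, o = conv_layer_sizes(k=3, c_in=3, c_out=64, h_out=224, w_out=224)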
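The strategy-making step can be illustrated with a minimal particle swarm optimization loop. The objective below is a stand-in, not the thesis's time-overhead model: it charges parameter-synchronization traffic for layers run data-parallel and activation-transfer traffic for layers run model-parallel. The sketch only shows how PSO can search over per-layer parallelization choices.

import random

def time_cost(strategy, layer_params, layer_outputs):
    """Hypothetical communication cost of a per-layer parallelization plan.

    strategy[i] < 0.5 means data parallelism for layer i (synchronize its
    parameters); strategy[i] >= 0.5 means model parallelism (transfer its
    output activations).
    """
    cost = 0.0
    for s, p, o in zip(strategy, layer_params, layer_outputs):
        cost += p if s < 0.5 else o
    return cost

def pso(objective, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5):
    """Standard particle swarm optimization over [0, 1]^dim."""
    pos = [[random.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = pbest_val.index(min(pbest_val))
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Hypothetical per-layer sizes (MB) from the profiling step.
params  = [0.2, 4.7, 9.4, 102.0]
outputs = [12.8, 6.4, 3.2, 0.02]
best, cost = pso(lambda s: time_cost(s, params, outputs), dim=len(params))
plan = ["data" if s < 0.5 else "model" for s in best]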
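For the inference side, the following sketch shows one plausible shape of the two-step strategy, under assumed names and numbers: offloading dispatches each task to the server with the earliest estimated finish time given its current load, performance, and the task's deadline, while scheduling orders tasks by deadline and upgrades accuracy-sensitive tasks to a slower, more accurate model when slack permits. The field names, cost constants, and fast/accurate model pair are assumptions, not details taken from the thesis.

from dataclasses import dataclass

@dataclass
class Task:
    deadline: float           # seconds from now
    accuracy_sensitive: bool  # prefer the accurate model when slack allows

@dataclass
class Server:
    speed: float              # relative performance factor
    queue_time: float = 0.0   # current load: pending work in seconds

# Hypothetical per-task compute costs (seconds at speed 1.0).
FAST_COST, ACCURATE_COST = 0.05, 0.20

def offload(task, servers):
    """Dispatch to the server with the earliest estimated finish time,
    skipping servers that would already miss the task's deadline."""
    def finish(s):
        return s.queue_time + FAST_COST / s.speed
    feasible = [s for s in servers if finish(s) <= task.deadline] or servers
    best = min(feasible, key=finish)
    best.queue_time += FAST_COST / best.speed
    return best

def schedule(server, tasks):
    """Order tasks by deadline; use the accurate model only when the task
    is accuracy-sensitive and its slack covers the extra compute time."""
    plan, clock = [], 0.0
    for t in sorted(tasks, key=lambda t: t.deadline):
        use_acc = (t.accuracy_sensitive and
                   clock + ACCURATE_COST / server.speed <= t.deadline)
        cost = ACCURATE_COST if use_acc else FAST_COST
        clock += cost / server.speed
        plan.append((t, "accurate" if use_acc else "fast"))
    return plan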
Keywords/Search Tags:deep learning, neural networks, distributed training, inference, task scheduling