
Research On The Optimization Scheme Of Inference Stage Based On Model Partition Under Edge Intelligence

Posted on: 2022-08-23  Degree: Master  Type: Thesis
Country: China  Candidate: Z G Xu  Full Text: PDF
GTID: 2518306560455684  Subject: Software engineering
Abstract/Summary:
With the development of the Internet of Everything, edge computing, as a new computing model, can make up for the shortcomings of the traditional cloud computing model, which struggles to cope with the large volume of data generated at the network edge and with increasingly strict latency requirements. In recent years, driven by the third wave of artificial intelligence, applications based on deep neural networks have been used more and more widely in industry and society. The collision of edge computing and artificial intelligence has produced "edge intelligence." Within edge intelligence, accelerating model inference has long been a research hotspot. As a novel technique, model partition can effectively reduce the inference latency of deep neural networks. However, applying model partition in practice faces two difficulties: one is applying it to different neural network architectures, and the other is applying it in complex edge computing environments. In this thesis, we first study concrete methods for applying model partition to different neural network models, and then study the use of model partition in complex edge computing scenarios to reduce the time cost of deep neural network tasks. The main contributions are as follows:

1. For deep neural network models, we first derive the optimal partition method for purely linear (chain) models, and then consider how to partition models containing subgraphs. We apply the S-T minimum cut idea from graph theory to process the subgraphs in the neural network, and propose a subgraph-aware partition method named Neurosurgeon of Subgraph Considered (NSC). Experiments on different neural networks show that the proposed method achieves better results than executing only on the terminal device, executing only on the server, or partitioning without considering subgraphs.

2. For the complex edge computing environment, we consider heterogeneous edge servers and multi-device, multi-task scenarios, and establish a mathematical model that minimizes the average time cost over all tasks under model partition. The model explicitly accounts for the task waiting queue and the task waiting time on each edge server. We then propose the Partition Points Selection (PPS) algorithm to reduce the solution space, and, combining a greedy strategy with progressive search, propose a joint task allocation and model partition algorithm, Greedy Strategy for the Progressive Inference (GSPI).

3. For real-world deployment, we propose a practical online algorithm based on GSPI. Using measurements from actual machines as the initial data for our simulation, we verify that the proposed framework and algorithms save more than 30% of the time consumption compared with the traditional model.

In summary, the proposed NSC algorithm partitions deep neural network models containing subgraphs effectively, and the GSPI algorithm and its online variant perform well for joint model partition and task allocation in complex edge environments.
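The chain-model case in point 1 can be illustrated with a minimal sketch: run layers 0..k-1 on the device, send the activation at the cut over the network, and run the rest on the server, choosing the k with the lowest end-to-end latency. The function name, layer timings, and bandwidth below are illustrative assumptions, not the thesis's measured values or exact formulation.

```python
# Hypothetical sketch: latency-optimal partition point for a purely
# linear (chain) DNN. device_ms[i]/server_ms[i] are per-layer run times,
# out_kb[i] is the size of layer i's output activation.

def best_partition_point(device_ms, server_ms, out_kb, bandwidth_kbps, input_kb):
    """Return (k, latency): layers 0..k-1 run on the device, layers
    k..n-1 on the server. k = 0 offloads everything; k = n is local-only."""
    n = len(device_ms)
    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        local = sum(device_ms[:k])            # device-side compute
        remote = sum(server_ms[k:])           # server-side compute
        # tensor crossing the network at the cut (raw input if k == 0)
        sent_kb = input_kb if k == 0 else out_kb[k - 1]
        transfer = 0.0 if k == n else sent_kb / bandwidth_kbps * 1000.0
        latency = local + transfer + remote
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency
```

With a large early activation and a fast server, the best cut typically falls just after the layer that shrinks the data, which is the intuition behind partitioning rather than pure offloading or pure local execution.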
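The joint allocation problem in point 2 can likewise be sketched with a simple greedy rule: assign each task, in turn, to the edge server and partition point that minimize its finish time, counting the work already queued on that server. This is only a toy illustration of the greedy idea behind GSPI; the helper names, the pruned candidate count K (standing in for the PPS step), and the timing model are all assumptions, not the thesis's actual algorithm.

```python
# Hypothetical greedy sketch of joint task allocation + model partition.
# partition_latency(task, server, k) -> (device_time, server_time) for
# cutting the task's model at candidate point k; it is a caller-supplied
# stand-in for profiled timings.

def greedy_assign(tasks, servers, partition_latency, num_points=3):
    """Assign each task to the (server, partition point) minimizing its
    finish time given each server's current queue wait. Returns a dict
    task -> (server, k, finish_time)."""
    queue_wait = {s: 0.0 for s in servers}  # running busy-until time
    plan = {}
    for t in tasks:
        best = None
        for s in servers:
            for k in range(num_points + 1):
                dev, srv = partition_latency(t, s, k)
                # device part overlaps the server queue; the server part
                # starts once both the queue and the device are done
                finish = max(dev, queue_wait[s]) + srv
                if best is None or finish < best[0]:
                    best = (finish, s, k)
        finish, s, k = best
        queue_wait[s] = max(queue_wait[s], finish)
        plan[t] = (s, k, finish)
    return plan
```

Because the queue wait grows as tasks pile onto a server, later tasks naturally shift toward other servers or toward cutting the model later (doing more work on the device), which mirrors the load-balancing effect the thesis attributes to considering waiting queues in the mathematical model.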
Keywords/Search Tags:Edge Computing, Edge Intelligence, Model Partition, Task Allocation