
Research On Optimization And Scheduling Strategy Of Neural Network Distributed Deployment

Posted on: 2024-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z R Lin
Full Text: PDF
GTID: 2568307079976459
Subject: Electronic information
Abstract/Summary:
In recent years, neural networks have developed rapidly, but as network depth grows, so does the computational cost, which makes deploying deep neural network (DNN) models on terminal devices challenging. Distributed deployment of DNNs combined with edge computing is a feasible way to address this problem: the DNN model is partitioned into three parts and deployed on nodes at three levels, namely terminal, edge, and cloud, so that the terminal and edge can each run part of the DNN. This not only utilizes terminal and edge resources but also relieves the load on the cloud. Building on this method, this thesis conducts a further exploration; the specific contributions are as follows.

First, this thesis proposes an algorithm that selects the position of the early exit point, following the early-exit idea in DNN distributed deployment. The algorithm takes the DNN layer as its basic unit, considers each layer's computation amount, data volume, computation time, network communication time, and inference accuracy, and uses a greedy strategy to iteratively update the position of the layer with the least overhead. A system that deploys DNNs hierarchically at the resulting least-overhead positions achieves lower inference latency than purely terminal-based deployment. To further improve the overall accuracy of the system, this thesis also proposes a "separate training for each branch" method, which outperforms the original "weighted training over the loss of each exit point" method in both inference latency and accuracy.

Second, to optimize the performance of a single device, this thesis proposes a single-node scheduling architecture comprising one scheduling strategy and two optimization strategies. The core idea of the scheduling strategy is to combine DNN tasks into task groups according to the model they use; data within the same task group can be spliced into one large tensor for unified computation. Task groups are scheduled by priority, where a group with more tasks and a shorter execution time receives higher priority. The two optimization strategies are to upload queued DNN tasks in advance when the device is overloaded, and to package network transmissions destined for the same target together. This scheduling architecture achieves lower average latency and higher throughput than FCFS scheduling and the Sufferage algorithm when device resources and performance cannot keep up with the task creation rate.

Finally, for the selection of edge nodes, this thesis proposes an inter-node scheduling strategy that uses a node load evaluation model and an LSTM-based node load prediction model to select edge nodes in two stages. The first stage uses the load evaluation model to find an edge node with a light load over the recent period as the terminal's initial edge node. In the second stage, if the initial edge node becomes overloaded while DNN inference tasks are actually being uploaded, the terminal reselects an edge node predicted to have a lighter future load for subsequent inference, according to the load prediction model. This strategy achieves higher throughput than the common random and round-robin strategies.
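The greedy early-exit placement described in the first contribution can be sketched as follows. The abstract does not give the exact overhead formula, so this sketch assumes an illustrative cost that sums cumulative on-device compute time, the communication cost of transmitting the exit layer's output, and a penalty for lost accuracy; the function names and weights `alpha`/`beta` are hypothetical.

```python
def exit_overhead(layers, exit_idx, alpha=1.0, beta=1.0):
    """Illustrative overhead of exiting after layer `exit_idx`:
    cumulative on-device compute time, plus the cost of transmitting
    that layer's output, plus a penalty for the accuracy given up
    relative to running the full network."""
    compute = sum(l["compute_time"] for l in layers[: exit_idx + 1])
    comm = layers[exit_idx]["comm_time"]
    acc_loss = layers[-1]["accuracy"] - layers[exit_idx]["accuracy"]
    return compute + alpha * comm + beta * acc_loss

def greedy_exit_point(layers):
    """Greedily keep the exit position with the least overhead,
    updating it whenever a later candidate layer is cheaper."""
    best, best_cost = 0, exit_overhead(layers, 0)
    for i in range(1, len(layers)):
        cost = exit_overhead(layers, i)
        if cost < best_cost:
            best, best_cost = i, cost
    return best
```

With per-layer profiles measured offline, the returned index is where the early-exit branch would be attached; the thesis's actual cost model also weighs computation and data amounts per layer.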
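The single-node scheduling idea (group tasks by model, then prioritize groups with more tasks and shorter execution time) might look like the sketch below. The priority formula `count / execution_time` is an assumption consistent with the description, not the thesis's exact rule, and the field names are hypothetical.

```python
from collections import defaultdict

def schedule_groups(tasks):
    """Group DNN tasks by the model they use; inputs within one group
    could then be spliced into a single large tensor and run together.
    Groups are ordered so that a group with more tasks and a shorter
    execution time runs first (priority = task count / exec time,
    an illustrative formula)."""
    groups = defaultdict(list)
    for t in tasks:
        groups[t["model"]].append(t)

    def priority(item):
        _, ts = item
        # Batched execution time is bounded by the slowest member.
        exec_time = max(t["exec_time"] for t in ts)
        return len(ts) / exec_time

    return [model for model, _ in sorted(groups.items(), key=priority, reverse=True)]
```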
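The two-stage edge-node selection can be sketched as below. The thesis uses a load evaluation model for stage one and an LSTM predictor for stage two; here `recent_avg_load`, `current_load`, `predict_load`, and the overload threshold are illustrative stand-ins for those components.

```python
def select_edge_node(nodes, recent_avg_load, current_load, predict_load,
                     overload_threshold=0.8):
    """Two-stage edge-node selection (illustrative).
    Stage 1: pick the node with the lightest load over the recent
    period (the thesis's load evaluation model).
    Stage 2: if that node is overloaded when tasks are actually
    uploaded, reselect the node with the lightest *predicted* future
    load (`predict_load` stands in for the LSTM-based predictor)."""
    initial = min(nodes, key=lambda n: recent_avg_load[n])
    if current_load[initial] > overload_threshold:
        return min(nodes, key=predict_load)
    return initial
```

Splitting the decision this way keeps the cheap evaluation model on the common path and only consults the prediction model when the initial choice turns out to be overloaded.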
Keywords/Search Tags: Neural Network, Distributed Deployment, Resource Scheduling, Early Exit