Font Size: a A A

Communication-Efficient Computation Load Scheduling For Coded MapReduce

Posted on:2020-01-03Degree:MasterType:Thesis
Country:ChinaCandidate:M H ZhaoFull Text:PDF
GTID:2428330572967276Subject:Engineering
Abstract/Summary:PDF Full Text Request
In recent years,edge computing has become one of the most attractive emerging techniques in the next-generation network,providing a powerful platform for the realization of many novel computation-intensive applications,e.g.,virtual reality,autonomous driving and smart grid con-trol.This poses demanding requirements for the quality of computation experience,which cannot be easily satisfied by solely relying on upgrading to more powerful servers.With the cooper-ation among edge servers,computational tasks can be decomposed using the MapReduce-type frameworks,which are well-known distributed computing frameworks for the development of s-calable parallel applications for processing big data on a large number of servers.These real-time computation-intensive applications usually require low-latency response from the edge computing servers.However,the communication load among the edge servers becomes the bottleneck for further improving the performance of edge computing.And the edge computing resources change dynamically with time,how to make full use of dynamically changing redundant computing re-sources to improve computing performance,how to minimize the communication load among the edge servers.Motivated by this,this thesis presents some research on the communication-efficient computation load scheduling for coded MapReduce.Firstly,we develop a communication-efficient computation load scheduling framework to complete the computation tasks with coded MapReduce considering the intrinsic tradeoff between the communication and computation loads.Our goal is to minimize the communication load un-der time-varying excess computing resources.In the offline setting,the full information about the available computational resource is known in advance.In this case,there is a complicated in-terdependence between the tasks and the computation load,giving rise to a difficult optimization problem.By decoupling the computation load to the tasks and the computing repetition factor,we transform the offline problem to a convex optimization task scheduling problem.We obtain the optimal computation load scheduling algorithm by adopting the augmented Lagrangian method.In the online setting,We show that the competitive ratio of any online algorithm for the minimiza-tion of the total communication load is lower bounded by the maximum and minimum value of the computing resource state.We derive a worst-case performance bound of the online equal task scheduling(ETS)algorithm by competitive analysis.We show that competitive ratio is bounded above by the maximum and minimum values of the computing resource.We make full use of past state information of computing resource for pre-planing and propose an improved algorithm based on the ETS algorithm in a learning manner.Secondly,we develop the general case a stochastic communication-efficient computation load scheduling framework to complete the computation tasks with coded MapReduce considering ar-bitrary arrival and strict delay constraints over time-varying computing resource.In this case,there is a complicated interdependence between the tasks and the computation load,giving rise to a difficult optimization problem.By decoupling the computation load to the tasks and the comput-ing repetition factor,we transform the offline problem to a convex optimization task scheduling problem.And through the analysis,the optimal solution has the property of dynamic water-filling.Next,the optimal solution property is used to find the solution-plane.Finally,we obtain the com-putation load scheduling algorithm under offline conditions.Guided by the optimal algorithm,we propose a dynamic online algorithm based on the predicted computing resource.Finally,We built a computation load algorithm verification platform for coded MapReduce framework,which consists of computing resource initialization,coded MapReduce initialization,and computation load scheduling module.It fully implements the coded MapReduce framework and computation load scheduling capabilities.The computation load scheduling algorithm in this paper is applied to the commonly used test task TeraSort in distributed computing.The final simulation results show that the computation load scheduling algorithm proposed significantly improves the computation performance of the coded MapReduce.
Keywords/Search Tags:edge computing, optimize communication load, computation load scheduling, offline algorithm, online algorithm
PDF Full Text Request
Related items