
The Research On Utility Function Based Fair Scheduling In Data Analysis Cluster

Posted on: 2018-04-07; Degree: Master; Type: Thesis
Country: China; Candidate: C C Han; Full Text: PDF
GTID: 2428330512498180; Subject: Computer technology
Abstract/Summary:
With the popularity of resource management frameworks such as YARN and Mesos, clusters can run many different data analysis applications. However, the fair share given to each job may not cover the resources it needs to complete before its deadline, especially during overloaded periods. Moreover, a single deadline value is too coarse a criterion for scheduling jobs. We therefore introduce the time-utility function (TUF) to indicate the profit, namely the utility, added to the system when a job completes. The utility of a job decreases near or after its deadline, and the scheduling objective is to maximize the accrued utility. We assume that jobs have been classified into several queues and that each job has been given its TUF. When a slot becomes free, the scheduler decides which queue receives it and then launches a task from that queue on the slot. This problem is NP-hard because scheduling jobs with step-shaped TUFs, a special case of our problem, has been proved NP-hard. We propose dynamic-weighted scheduling (DWS), a heuristic algorithm with time complexity O(n^3), based on two principles: 1) the weights of the job queues are adjusted dynamically according to system load; 2) jobs within an individual queue are sequenced by a classified greedy algorithm. We find that DWS improves the accrued utility to about 1.5x to 2x that of the FIFO fair scheduler and reduces the number of jobs missing their deadlines by nearly 10% compared with the well-performing DASA algorithm. Moreover, DWS can easily be implemented as a plug-in component for current fair-sharing schedulers.

In addition, we consider the problem of multiple-resource scheduling in a TUF-based data analysis data center. Because deadline objectives are often violated under fluctuating system load, and because improving throughput can relieve the load on the data center, we design a heuristic algorithm named appointment-utilization scheduling (AUS) that combines resource placement, TUF maximization, and fairness. First, streaming jobs are appointed free resources in the system. Then an online filter chooses jobs under the constraints of the TUF and a fairness factor. From the pool of chosen jobs, the task with the best resource placement is scheduled based on a dot-product function. This algorithm can flexibly address multiple-resource scheduling under all three objectives. The final evaluation results show that AUS not only improves CPU utilization by about 15% compared with the original Capacity Scheduler in the YARN framework when the system is overloaded, but also allocates CPU resources in step with network resources, which leaves reasonable remaining resource space for other jobs. In addition, the CDF of the completion times of interactive jobs reveals that AUS improves the deadline objective by 30% in YARN if 2 s is taken as the deadline of interactive jobs.
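Two ideas in the abstract can be made concrete with a minimal Python sketch: a time-utility function whose value drops after the deadline, and a dot-product score for choosing the task that best matches the free resources. The abstract does not give the thesis's actual TUF shapes or scoring formula, so everything here is an illustrative assumption: the linear decay, the `Job` fields, the normalized (CPU, network) vectors, and the function names `time_utility`, `alignment`, and `pick_task` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Job:
    # All fields are hypothetical; the thesis may model jobs differently.
    name: str
    deadline: float              # seconds after submission
    max_utility: float           # utility earned if finished on time
    decay: float                 # seconds past deadline until utility is zero
    demand: Tuple[float, float]  # normalized (cpu, network) demand per task


def time_utility(job: Job, completion_time: float) -> float:
    """Assumed TUF: full utility up to the deadline, then linear decay to 0."""
    if completion_time <= job.deadline:
        return job.max_utility
    late = completion_time - job.deadline
    return max(0.0, job.max_utility * (1.0 - late / job.decay))


def alignment(demand: Sequence[float], free: Sequence[float]) -> float:
    """Dot product of a task's demand vector with the free-resource vector."""
    return sum(d * f for d, f in zip(demand, free))


def pick_task(jobs: Sequence[Job], free: Sequence[float]) -> Optional[Job]:
    """Among jobs whose demand fits in the free resources, pick the one
    with the highest dot-product alignment; return None if nothing fits."""
    fitting = [j for j in jobs if all(d <= f for d, f in zip(j.demand, free))]
    return max(fitting, key=lambda j: alignment(j.demand, free), default=None)


if __name__ == "__main__":
    a = Job("a", deadline=10.0, max_utility=100.0, decay=5.0, demand=(0.2, 0.1))
    b = Job("b", deadline=10.0, max_utility=100.0, decay=5.0, demand=(0.1, 0.4))
    print(time_utility(a, 8.0), time_utility(a, 12.5))
    chosen = pick_task([a, b], free=(0.5, 0.5))
    print(chosen.name if chosen else None)
```

A step-shaped TUF, the special case the abstract cites for NP-hardness, would correspond to utility falling to zero immediately at the deadline rather than decaying over a `decay` interval.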
Keywords/Search Tags:data center, time-utility function, real-time, deadline, fairness, throughput, utilization