SLA-based Adaptive Job Scheduling In Heterogeneous Hadoop Clusters

Posted on: 2020-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: X M Du
GTID: 2428330590479439
Subject: Computer Science and Technology
Abstract/Summary:
As open source distributed computing and storage software, Hadoop has become the most popular big data processing platform. As cloud computing technology matures, more and more cloud service providers use the Hadoop platform to deliver cloud services. To protect the interests of both cloud service providers and customers, the two parties sign a Service Level Agreement (SLA); a provider that fails to honor the SLA is penalized for breach of contract. At the same time, because cloud applications keep expanding and the volume of network data is growing rapidly, machines with different performance levels are added to the cluster as computing nodes, forming heterogeneous clusters. Hadoop's built-in scheduling algorithms do not take heterogeneity into account, so in practice jobs execute inefficiently and fail to complete within their SLA deadlines. How to improve resource utilization, reduce job execution time, and avoid SLA violations in heterogeneous clusters has therefore become an urgent problem.

To address these problems, this thesis studies SLA-based adaptive job scheduling on a heterogeneous Hadoop 2.0 cluster from three aspects: job performance prediction, job scheduling, and resource allocation. The main work and results are as follows:

1. To address the problem that submitted jobs may not satisfy their SLA, an SLA-aware adaptive scheduling mechanism and a job performance prediction model are proposed. The mechanism pre-processes each submitted job with the prediction model. The model is adapted to Hadoop 2.0, takes into account changes in the number of reduce tasks, and uses locally weighted linear regression to predict the execution time of each stage of a MapReduce job. The predicted job execution time is compared with the deadline to decide whether to submit the job to the cluster to wait for scheduling. Experimental results show that the accuracy of the job performance prediction model exceeds 95%, and that with the SLA-aware adaptive scheduling mechanism the rate of deadline-related SLA violations decreases by 30%.

2. Because the Capacity scheduling algorithm does not consider SLAs and is not suited to heterogeneous clusters, it is improved in two respects: job selection and resource allocation. An SLA-based job scheduling and resource allocation algorithm (the SLANCLDR algorithm) is proposed. The algorithm dynamically calculates a weight for each job from SLA-related parameters and the job's real-time status, and uses this weight to determine the job's priority. It then allocates resources to tasks according to the computing power of each node, the job type, and data locality in the heterogeneous environment. Experimental results show that the algorithm improves cluster resource utilization and effectively reduces job execution time.

The experimental results for prediction accuracy, SLA violation rate, cluster resource utilization, and job execution time indicate that the work in this thesis is feasible and effective.
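As an illustration of the prediction-and-admission step in item 1, the following is a minimal Python sketch, not the thesis's implementation. It assumes a single feature (input size in GB), a Gaussian kernel with bandwidth `tau`, and three stages named `map`, `shuffle`, and `reduce`. The function names `lwlr_predict` and `admit_job` and the sample history are all hypothetical. The thesis's model also accounts for the number of reduce tasks, which this sketch omits.

```python
import math


def lwlr_predict(xs, ys, x0, tau=1.0):
    """Predict y at x0 with one-feature locally weighted linear regression.

    Each training point is weighted by a Gaussian kernel centred on x0,
    then a weighted least-squares line is fitted and evaluated at x0.
    """
    weights = [math.exp(-((x - x0) ** 2) / (2.0 * tau ** 2)) for x in xs]
    sw = sum(weights)
    if sw == 0.0:
        raise ValueError("all kernel weights underflowed; increase tau")
    sx = sum(w * x for w, x in zip(weights, xs))
    sy = sum(w * y for w, y in zip(weights, ys))
    sxx = sum(w * x * x for w, x in zip(weights, xs))
    sxy = sum(w * x * y for w, x, y in zip(weights, xs, ys))
    denom = sw * sxx - sx * sx
    if abs(denom) < 1e-12:
        # Degenerate fit (e.g. one effective point): use the weighted mean.
        return sy / sw
    slope = (sw * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / sw
    return intercept + slope * x0


def admit_job(history, input_gb, deadline_s, tau=2.0):
    """Sum per-stage predictions and compare the total with the deadline.

    history maps a stage name to a list of (input_gb, seconds) samples.
    Returns (admit, predicted_seconds).
    """
    predicted = 0.0
    for samples in history.values():
        xs = [s[0] for s in samples]
        ys = [s[1] for s in samples]
        predicted += lwlr_predict(xs, ys, input_gb, tau)
    return predicted <= deadline_s, predicted


# Hypothetical profiling history: (input size in GB, stage time in seconds).
history = {
    "map": [(1, 10), (2, 20), (4, 40), (6, 60), (8, 80)],
    "shuffle": [(1, 7), (2, 12), (4, 22), (6, 32), (8, 42)],
    "reduce": [(1, 4), (2, 7), (4, 13), (6, 19), (8, 25)],
}

admit, predicted = admit_job(history, input_gb=4.5, deadline_s=100)
print(admit, round(predicted, 2))
```

Because the regression is refitted around each query point, the sketch keeps the raw history instead of a single global model. This costs more per prediction but lets the fit follow non-linear behaviour across input sizes.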
Keywords/Search Tags:Hadoop, Service level agreement, Performance prediction, Heterogeneous cluster, Resource scheduling