
The Research Of Task Scheduling Algorithm For MapReduce Framework In Cloud Environment

Posted on: 2014-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: W M Zou
GTID: 2248330398967120
Subject: Computer application technology
Abstract/Summary:
Cloud computing is a new computing and service model that provides users with a variety of computing and storage services, mainly through the Internet. Promoted by major domestic and foreign research institutions, it has developed rapidly and offers a good solution for processing huge amounts of data on Internet application platforms. Google proposed the MapReduce parallel computing model for the concurrent processing of massive data in 2004. Cloud computing platforms based on the MapReduce model have since emerged, of which Hadoop is the most widely used. Task scheduling under the MapReduce model has become a hot research topic. An efficient task scheduling algorithm can improve the performance of a cloud computing system and make good use of the resources in cloud computing centers; most importantly, it is of great significance for processing huge amounts of data on cloud computing platforms.

First, the dissertation describes the background of cloud computing and related technologies, and mainly analyzes the MapReduce parallel computing model and its implementation process. Second, we introduce the new features of the MapReduce framework and describe several typical task scheduling algorithms in detail, such as the FIFO algorithm, the MaxCover-Balance algorithm, the fairness algorithm, the delay scheduling algorithm, and the genetic algorithm.

Taking the weaknesses of those algorithms into account, we propose two task scheduling algorithms. The first addresses deficiencies in the current delay scheduling algorithm: it optimizes the algorithm's performance by dynamically adjusting the jobs' waiting-time threshold according to variable factors in the data center, such as the node idle rate and the network transmission rate. The second, CSGA (Consumer Satisfaction Genetic Algorithm), improves on the current Adaptive Genetic Algorithm (AGA). Under the premise of guaranteeing consumer fairness, CSGA schedules tasks to the nodes that hold the tasks' data blocks in order to reduce data transfer cost, which aims to shorten the completion time of all tasks and to improve consumer satisfaction.

Finally, to compare the improved algorithms with the originals, we simulate a cloud computing environment in Matlab and set the experimental parameters. Repeated experiments show that the improved delay scheduling algorithm outperforms previous delay scheduling algorithms in terms of job response time and node load balance. The improved genetic algorithm has a shorter response time and higher satisfaction than the adaptive genetic algorithm, and is better adapted to the cloud computing environment.
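
To make the idea of a dynamically adjusted waiting-time threshold concrete, the following is a minimal Python sketch of delay scheduling with an adaptive threshold. It is an illustration under stated assumptions, not the thesis's algorithm: the abstract does not give the threshold formula, so `wait_threshold`, its parameters (`base`, `idle_rate`, `net_rate`, `max_net_rate`), the `Job` class, and the `skip_cost` bookkeeping are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    local_nodes: set      # nodes that hold this job's input data blocks
    waited: float = 0.0   # accumulated waiting time from being skipped


def wait_threshold(base, idle_rate, net_rate, max_net_rate):
    """Hypothetical rule (an assumption, not taken from the thesis).

    Wait longer when many nodes are idle (a data-local slot is likely to
    free up soon) and when the network is slow (remote reads are costly).
    """
    net_penalty = 1.0 - min(net_rate / max_net_rate, 1.0)
    return base * (idle_rate + net_penalty)


def pick_job(queue, free_node, threshold, skip_cost):
    """Choose a job for free_node; returns (job, is_data_local)."""
    for job in queue:
        if free_node in job.local_nodes:
            return job, True          # data-local launch
        if job.waited >= threshold:
            return job, False         # waited long enough: run non-locally
        job.waited += skip_cost       # skip this job for now
    return None, False                # nothing to launch on this node
```

A scheduler loop would recompute the threshold from current cluster measurements before each call to `pick_job`, so that jobs are allowed to give up on data locality sooner when waiting is unlikely to pay off.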
Keywords/Search Tags: cloud computing, MapReduce computing model, task scheduling algorithm, data location, fairness