The Research On High Performance Task Scheduling Technology Based On Mapreduce In Cloud Computing

Posted on: 2014-12-14
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Xu
Full Text: PDF
GTID: 2268330392973481
Subject: Computer Science and Technology
Abstract/Summary:
Since it was first proposed in 2006, cloud computing has remained a hot topic in the IT industry and has high commercial value. Google's MapReduce programming model provides automatic parallel and distributed computation over large data sets; it is high-performance and scalable, and it has been applied in the Google App Engine cloud platform and in the open-source cloud platform Hadoop. This dissertation studies MapReduce-based task scheduling algorithms for cloud computing, improves them with respect to job fairness, network utilization, and system throughput, and applies the improved algorithms to the community self-service health kiosk project.

This dissertation proposes an improved Weighted Round-Robin task scheduling algorithm based on the traditional weighted round-robin scheduling algorithm. The improved algorithm adds a mechanism that improves the data locality of jobs while largely preserving the fairness of traditional Weighted Round-Robin. This local task priority mechanism adjusts the scheduling order so that local tasks are scheduled first in every cycle, which increases the percentage of a job's tasks that run locally. Better data locality reduces the total completion time of all jobs, so the algorithm improves system throughput while maintaining a certain degree of job fairness.

This dissertation also studies the factors that affect job data locality and presents a data-locality-improved cloud computing task scheduling algorithm based on the first-in, first-out (FIFO) algorithm. Because the probability of scheduling a job's task locally gradually decreases as the number of its remaining tasks decreases, the algorithm moves jobs whose local scheduling probability falls below a threshold into a separate queue. Local tasks in that queue are scheduled with priority, which raises the proportion of tasks executed locally. This saves input data transfer time and thereby improves system throughput and network utilization.

This dissertation adopts the Hadoop cloud computing platform and runs the proposed task scheduling algorithms in the health kiosk project. The experimental results show that both the improved Weighted Round-Robin task scheduling algorithm and the data-locality-improved task scheduling algorithm schedule about 15% more local tasks than the original algorithms, and that both also reduce the total job completion time to some extent. The improved Weighted Round-Robin scheduling algorithm achieves good job fairness. As a result, pressure on the network of the cloud platform in the health kiosk project is relieved, system throughput is improved, and the response time of the application services is reduced, which improves the project's quality of service.
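
The abstract does not give the scheduling rules in detail, so the following is only a minimal illustrative sketch of the local task priority idea applied to weighted round-robin, not the thesis's actual algorithm. It assumes that each job may launch at most `weight` tasks per cycle and that, for each free slot, a job holding a task whose input data is on that node is preferred over the plain round-robin choice. The names `Task`, `Job`, `pick_task`, and `schedule_cycle` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    hosts: frozenset  # nodes that store this task's input data (assumed known)


@dataclass
class Job:
    name: str
    weight: int  # maximum number of tasks this job may launch per cycle
    pending: list = field(default_factory=list)


def pick_task(job, node):
    """Remove and return (task, is_local), preferring a task local to node."""
    for i, task in enumerate(job.pending):
        if node in task.hosts:
            return job.pending.pop(i), True
    # No local task for this node: fall back to the job's first pending task.
    return job.pending.pop(0), False


def schedule_cycle(jobs, free_slots):
    """Run one weighted round-robin cycle over the given free slots.

    free_slots is a list of node names, one entry per free task slot.
    The per-job quota keeps the weighted share of a plain cycle; only the
    order in which jobs are served within the cycle changes.
    """
    quota = {job.name: job.weight for job in jobs}
    assignments = []
    for node in free_slots:
        eligible = [j for j in jobs if quota[j.name] > 0 and j.pending]
        if not eligible:
            break  # every job has used its share or has no tasks left
        # Assumed priority rule: first eligible job with a task local to
        # this node, otherwise the next eligible job in round-robin order.
        chosen = next(
            (j for j in eligible if any(node in t.hosts for t in j.pending)),
            eligible[0],
        )
        task, is_local = pick_task(chosen, node)
        quota[chosen.name] -= 1
        assignments.append((node, chosen.name, task.task_id, is_local))
    return assignments
```

Under these assumptions, a job never receives more than its weight in a cycle, so the locality preference reorders work within a cycle rather than changing each job's share.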
Keywords/Search Tags:MapReduce, Fairness, Task Scheduling, Data Locality, Hadoop