Optimization Of Hadoop Scheduling Algorithm Based On Prediction

Posted on: 2017-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: X H Yuan
Full Text: PDF
GTID: 2348330503489807
Subject: Computer Science and Technology
Abstract/Summary:
When a Hadoop system has backward tasks (stragglers), the existing Hadoop speculative execution schedulers run a backup task on an idle node without comprehensively considering that node's current performance and reliability. As a result, the backup task may still fail or run slowly. This not only raises the failure rate of backup tasks, but also occupies too many system resources and prolongs system response time. It is therefore important, for both system performance and resource utilization, to study the existing Hadoop schedulers, identify their shortcomings, and propose an improved algorithm.

This thesis proposes the CPL (Computation Prediction of Late) scheduling algorithm, a prediction-based Hadoop speculative execution scheduling algorithm. CPL contains two main optimizations. First, the system maintains two idle node queues, a CPU idle node queue and an I/O idle node queue, ordered by failure rate in descending order. By matching the task type to the load type of the node, the optimized scheduler predicts the remaining completion time of the task and assigns the backup of a backward task to a node whose failure rate is as low as possible and whose performance is as stable as possible. This avoids backup task "jitter", reduces system response time, and decreases the waste of system resources. Second, CPL modifies the current task classification algorithm by using the sum of the Map time fragments occupied by the CPU task, which yields a more accurate classification method.

The performance of the CPL scheduling algorithm is verified on the CloudSim cloud computing simulation platform. The results show that, compared with the FIFO and LATE scheduling algorithms, CPL reduces job response time by 20% and 14%, respectively. Compared with the LATE scheduling algorithm, CPL also lowers the failure rate of backward tasks by 16%.
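
The abstract does not give CPL's formulas, so the following Python sketch only illustrates the general shape of the two ideas described above and should not be read as the thesis's implementation. The remaining-time estimate uses the well-known LATE heuristic (time left = (1 - progress) / progress rate). The `Node` fields, the 0.5 classification threshold, the `speed` factor, and all function names are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    kind: str            # "cpu" or "io": which resource is idle (assumed label)
    failure_rate: float  # fraction of past tasks that failed on this node
    speed: float = 1.0   # hypothetical relative speed, 1.0 = baseline


def remaining_time(progress: float, elapsed: float) -> float:
    """LATE-style estimate: (1 - progress) / (progress / elapsed)."""
    if progress <= 0.0:
        return float("inf")  # no progress yet, so no finite estimate
    return elapsed * (1.0 - progress) / progress


def classify_task(cpu_time: float, map_time: float,
                  threshold: float = 0.5) -> str:
    """Label a task by the share of its Map time spent on the CPU.

    The threshold is an assumption; the abstract only says that the sum
    of CPU-occupied Map time fragments drives the classification.
    """
    return "cpu" if cpu_time / map_time >= threshold else "io"


def pick_backup_node(task_kind: str, nodes: List[Node],
                     full_run_time: float,
                     time_left: float) -> Optional[Node]:
    """Choose an idle node of the matching kind for the backup task.

    A node is a candidate only if a fresh run there is predicted to
    finish before the original task would. Among candidates, prefer the
    lowest failure rate, then the shortest predicted run time.
    """
    candidates = [
        n for n in nodes
        if n.kind == task_kind and full_run_time / n.speed < time_left
    ]
    if not candidates:
        return None  # launching a backup would not help
    return min(candidates,
               key=lambda n: (n.failure_rate, full_run_time / n.speed))


# Usage: a backward task that is 25% done after 10 seconds.
idle_nodes = [
    Node("A", "cpu", failure_rate=0.10),
    Node("B", "cpu", failure_rate=0.02),
    Node("C", "io", failure_rate=0.01, speed=2.0),
]
kind = classify_task(cpu_time=6.0, map_time=10.0)
left = remaining_time(progress=0.25, elapsed=10.0)
chosen = pick_backup_node(kind, idle_nodes, full_run_time=20.0,
                          time_left=left)
```

In this sketch, node C is skipped despite its low failure rate because its idle resource does not match the task type, which mirrors the matching step described in the abstract.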
Keywords/Search Tags:Hadoop, Backward Task, Task Scheduling, CPL Algorithm