Research On Speculative Task Schedule Strategy For Hadoop Platform

Posted on: 2017-11-28
Degree: Master
Type: Thesis
Country: China
Candidate: Z Mei
Full Text: PDF
GTID: 2348330512462134
Subject: Computer application technology
Abstract/Summary:
As a supplement to job scheduling, speculative task scheduling launches a backup copy of a slow task in order to shorten the job completion time. Hadoop's original speculative scheduling algorithm assumes a homogeneous cluster, so it performs poorly in heterogeneous environments. The Longest Time To End algorithm also has shortcomings in heterogeneous environments: its computation of task progress and its identification of slow tasks are imprecise, its identification of slow nodes is inaccurate, and it does not consider real-time load when choosing the node that executes the backup task. These shortcomings seriously affect the service quality of applications and pose challenges for Hadoop-based big data processing.

To solve these problems, this thesis proposes an improved speculative task scheduling algorithm. The algorithm uses the historical and recent progress proportions of each task stage to improve how task progress is calculated. On that basis, it identifies slow map tasks and slow reduce tasks separately by their rate of progress increase. Finally, it uses task execution speed and real-time load to filter the candidate nodes for executing the backup task. Because the revised algorithm takes the heterogeneous environment, the speculative strategy, task differences, and the real-time load of nodes into account, it not only shortens job completion time but also improves the performance of the Hadoop cluster.

We ran the proposed algorithm, Hadoop's original algorithm, and the Longest Time To End speculative scheduling algorithm on a cluster that we set up, and compared and analyzed their job completion times. The experimental results verify the effectiveness of the proposed algorithm. This work contributes to research on, and improvement of, the current problems facing Hadoop job scheduling.
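The abstract gives no formulas, so the following is only a minimal Python sketch of the three steps it names: stage-weighted progress, slow-task detection by progress rate, and backup-node filtering by speed and load. Every function name, the blending factor `alpha`, the `threshold` and `max_load` cutoffs, and the scoring rule `speed * (1 - load)` are illustrative assumptions, not the thesis's actual definitions or any Hadoop API.

```python
from dataclasses import dataclass
from statistics import mean


def stage_weights(historical, recent, alpha=0.5):
    """Blend historical and recent per-stage time proportions (assumed rule).

    Both inputs list the share of task time spent in each stage; the result
    is normalised so the weights sum to 1.
    """
    blended = [alpha * h + (1 - alpha) * r for h, r in zip(historical, recent)]
    total = sum(blended)
    return [b / total for b in blended]


def task_progress(weights, stage_index, stage_fraction):
    """Weights of finished stages plus the weighted fraction of the current one."""
    return sum(weights[:stage_index]) + weights[stage_index] * stage_fraction


def progress_rate(progress_then, progress_now, interval_seconds):
    """Progress gained per second over the sampling interval."""
    return (progress_now - progress_then) / interval_seconds


def slow_tasks(rates, threshold=0.5):
    """Return ids whose rate is below threshold * mean rate (assumed cutoff).

    Call once with map-task rates and once with reduce-task rates so the two
    task types are judged separately.
    """
    if not rates:
        return []
    cutoff = threshold * mean(rates.values())
    return sorted(task for task, rate in rates.items() if rate < cutoff)


@dataclass
class Node:
    name: str
    speed: float  # mean progress rate of tasks recently run on this node
    load: float   # current load, scaled to the range 0..1


def pick_backup_node(nodes, max_load=0.8):
    """Keep nodes at least as fast as average and not overloaded, then
    prefer the one with the best speed-to-load trade-off (assumed score)."""
    if not nodes:
        return None
    avg_speed = mean(n.speed for n in nodes)
    candidates = [n for n in nodes if n.speed >= avg_speed and n.load <= max_load]
    return max(candidates, key=lambda n: n.speed * (1 - n.load), default=None)
```

In this sketch a scheduler would call `slow_tasks` on each heartbeat with freshly sampled rates, then pass the currently available nodes to `pick_backup_node`; a result of `None` means no node qualifies and no backup is launched in that round.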
Keywords/Search Tags:Hadoop, Job Scheduling, Heterogeneous Circumstance, Speculative Scheduling, Real-time Load