Research On Job Scheduling Method Under Hadoop Platform

Posted on: 2016-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: W L Chen
GTID: 2208330461483049
Subject: Computer application technology
Abstract/Summary:
Cloud computing has become increasingly important in big data areas such as web search and data mining, and Hadoop is widely used as an open source cloud computing platform. The job scheduling algorithm is a key component of Hadoop: a good job scheduling algorithm can improve both computation speed and the utilization of computing resources.

In this paper, we first give a detailed introduction to the theory and structure of Hadoop, then introduce two job scheduling algorithms included in Hadoop, the FIFO scheduler and the Fair scheduler. To quantify the differences between job schedulers, we design a method to calculate a cluster load balance factor and a job scheduling fairness factor. Using these measures, we compare the behavior of the FIFO scheduler and the Fair scheduler in a heterogeneous cluster through several experiments, and analyse the disadvantages of both schedulers.

Based on the characteristics of heterogeneous clusters and the good job scheduling fairness of the Fair scheduler, we design a Real Time Status based Scheduler (RTSS). RTSS guarantees that each job gets computing resources according to its size and priority, and it dynamically adjusts the number of computing resources on each node according to that node's load status. We have implemented the RTSS algorithm in the cluster. Compared with the results of the FIFO scheduler and the Fair scheduler, we conclude that RTSS improves cluster load balance without harming job execution speed or job scheduling fairness.
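
The abstract does not state the formulas behind the load balance factor or the fairness factor. As a rough illustration only, the sketch below assumes two common stand-ins: the population standard deviation of per-node load for load balance, and Jain's fairness index for scheduling fairness. Both function names and both definitions are assumptions made for this example and are not taken from the thesis.

```python
from statistics import pstdev


def load_balance_factor(node_loads):
    """Assumed metric: population standard deviation of per-node load.

    0.0 means every node carries the same load; larger values mean
    the load is spread less evenly across the cluster.
    """
    if not node_loads:
        raise ValueError("node_loads must not be empty")
    return pstdev(node_loads)


def jain_fairness_index(allocations):
    """Assumed metric: Jain's fairness index over per-job resource shares.

    Returns 1.0 when all jobs receive equal shares and 1/n when a
    single job receives everything (n = number of jobs).
    """
    if not allocations:
        raise ValueError("allocations must not be empty")
    total = sum(allocations)
    squares = sum(x * x for x in allocations)
    if squares == 0:
        # No job received anything; treat this as trivially fair.
        return 1.0
    return (total * total) / (len(allocations) * squares)


if __name__ == "__main__":
    # Hypothetical per-node load fractions and per-job slot counts.
    print(load_balance_factor([0.2, 0.8, 0.5]))
    print(jain_fairness_index([4, 4, 2, 2]))
```

Under these assumed definitions, a scheduler that strictly serves jobs in arrival order would tend to score lower on the fairness index when one large job holds most slots, while a scheduler that shares slots across jobs would score closer to 1.0.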
Keywords/Search Tags:Cloud Computing, Hadoop, job schedule, heterogeneous cluster, real time scheduling