Research And Implement Of Job Scheduling Method For Multi_User MapReduce Clusters

Posted on:2011-10-23

Degree:Master

Type:Thesis

Country:China

Candidate:K Wang

GTID:2178330338489889

Subject:Computer Science and Technology

Abstract/Summary:

Data-intensive computing today must process petabyte-scale data sets and gigabyte-scale data streams, which raises problems of large-scale data management, management of complex computing environments, and scalability of the computing platform. Hadoop is a scalable distributed computing architecture that combines many inexpensive PCs to provide large aggregate computing power, and its MapReduce parallel computing framework gives users a simple programming model. This thesis analyzes in depth the job scheduling approaches used in existing Hadoop clusters and studies the poor data locality caused by existing multi-user job scheduling methods. Because the existing Hadoop scheduling algorithms cannot achieve good data locality, we implement a waiting-time-based scheduling method that gives priority to scheduling a task on the node where its required data is stored. This improves data locality and reduces I/O overhead during computation, with the aim of increasing system throughput and reducing the average response time of a single job. To verify the method, we present its design and implementation and evaluate it experimentally. The results show that the method preserves fair sharing of the cluster among multiple users while greatly improving data locality, effectively increasing cluster throughput and reducing the average response time of a single job.
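The abstract does not give the details of the waiting-time-based method, so the following is only a minimal sketch of the general idea, not the thesis's implementation. It assumes that a job with no node-local task for a free slot is skipped until it has waited `max_wait` seconds, after which it may run a non-local task. The names `Job`, `pick_task`, and `max_wait`, and the use of "fewest running tasks first" as a stand-in for fair sharing, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Job:
    """Hypothetical job record used only for this illustration."""
    name: str
    pending: Dict[str, Set[str]]           # task id -> nodes storing its input data
    running: int = 0                       # running tasks (simple fairness key)
    waiting_since: Optional[float] = None  # when the job first passed up a slot


def pick_task(jobs: List[Job], node: str, now: float,
              max_wait: float) -> Optional[Tuple[str, str, bool]]:
    """Return (job name, task id, is_local) for a free slot on `node`, or None."""
    # Assumption: jobs with the fewest running tasks are served first,
    # as a simple stand-in for fair sharing among users.
    for job in sorted(jobs, key=lambda j: j.running):
        if not job.pending:
            continue
        # Prefer a task whose input data is stored on this node.
        local = [t for t, nodes in job.pending.items() if node in nodes]
        if local:
            task = local[0]
            del job.pending[task]
            job.running += 1
            job.waiting_since = None
            return job.name, task, True
        # No local task: start the wait clock if it is not already running.
        if job.waiting_since is None:
            job.waiting_since = now
        # After waiting long enough, accept a non-local task.
        if now - job.waiting_since >= max_wait:
            task = next(iter(job.pending))
            del job.pending[task]
            job.running += 1
            job.waiting_since = None
            return job.name, task, False
        # Otherwise skip this job and offer the slot to the next one.
    return None
```

In this sketch a slot on a node is offered to each job in fairness order; a job that cannot use it locally gives it up to the next job rather than reading its input over the network, which is how waiting trades a short delay for better data locality.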

Keywords/Search Tags:

Distributed Computing, MapReduce, Hadoop, Job Scheduling, Waiting Scheduling, Priority, Multi-user Shared

Related items

1	Research And Implement Of Job Scheduling Method For Multi_user Mapreduce Clusters
2	A Priority-based Scheduling Algorithm For Hadoop
3	Research On Hadoop Cluster Scheduling Optimization
4	Design Of Mapreduce Task Scheduling Algorithms In Heterogeneous Hadoop Cluster
5	Research On Scheduling Algorithm In Hadoop Mapreduce
6	The Research Of Hadoop Scheduling Algorithm And Improvement Strategy
7	Research On Optimization And Improvement Of MapReduce Job Scheduling Algorithm
8	The Research And Implementation Of Hadoop Scheduling Algorithm
9	An Optimized MapReduce Workflow Scheduling Algorithm For Heterogeneous Computing
10	Research On Algorithm Analysis And Modificating Of Job Scheduling For Hadoop