Hadoop Scheduling Algorithm Based On Job Classification And Cost Comparison

Posted on:2016-04-11

Degree:Master

Type:Thesis

Country:China

Candidate:Z Li

GTID:2308330479493941

Subject:Computer application technology

Abstract/Summary:

Hadoop has been widely adopted by institutions and researchers as an open cloud platform in the course of cloud development. The scheduler is a key factor in Hadoop's performance. This paper aims to find a well-performing scheduling algorithm by studying the merits and demerits of several existing classic schedulers. It proposes a scheduling algorithm, based on job classification and cost comparison, that reduces the running time of jobs on Hadoop, and then tests the new algorithm's performance.

Current classic schedulers each have their own merits. For example, the Job Queue Task Scheduler is simple and has low overhead, and the Fair Scheduler takes both big and small jobs into account. However, these schedulers do not consider the processor performance or memory size of the nodes. If a compute-intensive job is scheduled to nodes with high-frequency processors, its running time may be reduced. If a memory-intensive job is scheduled to nodes with large memory, its tasks are less likely to be killed.

When a job is scheduled to a node that does not hold the job's input data, the node has to copy the input data from other nodes, which takes more time than running on a node that already holds the data. When a job is about to be scheduled to such a node, the algorithm in this paper predicts the time needed to copy the data, queue, and run on this node (the non-local scheduling time). At the same time, it predicts the time needed to wait for the next heartbeat, schedule the job to a node that holds the job's input data, and queue and run on that node (the local scheduling time). The algorithm makes its decision by comparing the non-local and local scheduling times.

The proposed job classification and cost comparison scheduling algorithm consists of two sub-algorithms. The job classification algorithm schedules a job according to the node's type and the job's type, using machine learning to match the job's type to the Task Tracker's type. The cost comparison algorithm decides by the costs of non-local and local scheduling when a node does not hold the current job's input data, which eliminates blindness in non-local scheduling. The two sub-algorithms can be combined organically. Extensive tests were run to measure the algorithm's performance, and the results showed that the algorithm works well.
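The cost comparison step described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the names `NodeEstimate`, `non_local_cost`, `local_cost`, and `choose_placement` are hypothetical, the predicted times are assumed to be supplied by some external estimator, and preferring local scheduling on a tie is an assumption made here for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeEstimate:
    """Predicted times (seconds) for one candidate node; hypothetical structure."""
    queue_seconds: float  # predicted time spent queuing on the node
    run_seconds: float    # predicted time spent running on the node


def non_local_cost(copy_seconds: float, node: NodeEstimate) -> float:
    """Non-local scheduling time: copy the input data, then queue and run."""
    return copy_seconds + node.queue_seconds + node.run_seconds


def local_cost(heartbeat_wait_seconds: float, node: NodeEstimate) -> float:
    """Local scheduling time: wait for the next heartbeat, then queue and run."""
    return heartbeat_wait_seconds + node.queue_seconds + node.run_seconds


def choose_placement(copy_seconds: float, remote: NodeEstimate,
                     heartbeat_wait_seconds: float, local: NodeEstimate) -> str:
    """Return "non-local" only if it is strictly cheaper (tie rule is an assumption)."""
    if non_local_cost(copy_seconds, remote) < local_cost(heartbeat_wait_seconds, local):
        return "non-local"
    return "local"


# Copying is expensive here, so waiting for a node that holds the data wins.
decision_a = choose_placement(30.0, NodeEstimate(0.0, 60.0), 3.0, NodeEstimate(10.0, 60.0))
# The data-holding node has a long queue, so copying to the idle node wins.
decision_b = choose_placement(2.0, NodeEstimate(0.0, 50.0), 3.0, NodeEstimate(40.0, 60.0))
```

In the first call the non-local estimate is 90 s against 73 s for local scheduling, so the job waits for the data-holding node; in the second the estimates are 52 s against 103 s, so the job is scheduled non-locally.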

Keywords/Search Tags:

Scheduling, Job Classification, Cost Comparison
