Research On Load Balancing Algorithm For Scheduling Based On Hadoop

Posted on: 2017-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: S J Wang
Full Text: PDF
GTID: 2308330485989508
Subject: Computer Science and Technology
Abstract/Summary:
With the advent of Web 2.0 and the rapid development of the Internet, people depend on the network more and more. Since the "Internet+" initiative was put forward, major companies have responded by restructuring. Every industry now generates huge amounts of data each day, and the volume is growing explosively. The demand for massive information storage and new computing power has driven a new computing model: cloud computing. Hadoop, one such cloud platform, is an open-source platform that processes big data in a distributed manner and implements the MapReduce programming model; it is preferred by many scholars for big data research. An important component of Hadoop is the scheduler, which is mainly responsible for allocating system resources reasonably and scheduling jobs, so the quality of the scheduling algorithm plays a vital role in cluster performance. The study of the Hadoop scheduler and its algorithms is therefore of great significance.

This thesis studies the existing scheduling algorithms of the Hadoop platform and analyzes the principle of the LATE scheduling algorithm, together with the advantages and disadvantages of how it selects backup tasks and execution nodes in a heterogeneous environment. On this basis, an improved scheduling algorithm, IR-LATE, is proposed. First, IR-LATE categorizes cluster jobs according to their workload. Then, when launching a backup, IR-LATE chooses the slow task with the longest remaining time to completion, that is, the one most in need of a backup, and selects the optimal node to execute it. Finally, experiments compare IR-LATE with LATE. The results show that IR-LATE not only identifies slow tasks more accurately but also shortens the average job running time and improves the load balance of the cluster.
Keywords/Search Tags: cloud computing, Hadoop, MapReduce, LATE, load balancing
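
The backup-task selection described in the abstract can be illustrated with a minimal sketch of a LATE-style heuristic. This is not the thesis's IR-LATE implementation, whose details the abstract does not give. The names `Task`, `pick_backup_task`, `pick_node`, and the `slow_fraction` parameter are hypothetical. The sketch assumes each task reports a progress score in [0, 1] and its elapsed time, estimates the remaining time as (1 - progress) / progress rate, and treats "optimal node" as simply the fastest node with a free slot.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Task:
    task_id: str
    progress: float  # fraction of work completed, in [0, 1]
    elapsed: float   # seconds since the task started

    @property
    def rate(self) -> float:
        # Progress per second; 0 if the task has not run yet.
        return self.progress / self.elapsed if self.elapsed > 0 else 0.0

    @property
    def time_left(self) -> float:
        # Estimated seconds to completion at the current rate.
        if self.rate == 0:
            return float("inf")
        return (1.0 - self.progress) / self.rate


def pick_backup_task(tasks: List[Task], slow_fraction: float = 0.25) -> Optional[Task]:
    """Among the slowest tasks, return the one with the longest time left."""
    running = [t for t in tasks if t.progress < 1.0]
    if not running:
        return None
    # Assumption: a task is "slow" if its rate is in the lowest slow_fraction.
    by_rate = sorted(running, key=lambda t: t.rate)
    n_slow = max(1, int(len(by_rate) * slow_fraction))
    return max(by_rate[:n_slow], key=lambda t: t.time_left)


def pick_node(free_node_speeds: Dict[str, float]) -> Optional[str]:
    """Return the fastest node that has a free slot (hypothetical speed metric)."""
    if not free_node_speeds:
        return None
    return max(free_node_speeds, key=free_node_speeds.get)
```

For example, given four running tasks, `pick_backup_task` returns the candidate to back up and `pick_node` returns where to run the copy. A real scheduler would also cap the number of concurrent backups and would measure node speed from completed tasks.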