
Capacity Detection Of Nodes And Task Scheduling Method In MapReduce

Posted on: 2014-12-30
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Liu
GTID: 2308330479979447
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, cloud computing has developed explosively and quickly become mainstream in the IT industry. MapReduce is a programming model widely used in distributed computing environments. Because all Map functions, and likewise all Reduce functions, must synchronize at the end of their phase, MapReduce can suffer from the long tail problem, which badly degrades the resource utilization of some computing nodes even under very good scheduling algorithms. The main task schedulers in MapReduce are designed at the job level, and task-level schedulers can act only after detecting slow tasks. Existing schedulers treat computing nodes as homogeneous. In fact, by considering the differences between computing nodes, we can obtain better scheduling results and mitigate the long tail problem.

This paper analyses the essential cause of the long tail problem, which is the existence of semi-failure nodes. It proposes a distributed detection algorithm that detects semi-failure nodes directly. We put forward an interactive method in which nodes exchange information to obtain each node's capacity level relative to the other nodes. Based on node capacity, we discuss a scoring method for computing nodes and finally identify the semi-failure nodes.

This paper also puts forward an optimized task scheduling method based on the evaluation of computing nodes. We predict the future capacity of computing nodes while a job is running, and predict the job's end time under the best scheduling. With the job's end time determined, we assign the remaining tasks according to the capacity of the computing nodes. In simulation experiments, the method reduces the occurrence of the long tail problem, increases resource utilization, and decreases job execution time.
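The abstract does not give the scoring rule or the assignment procedure, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes each node's capacity is already known as a single number (tasks per unit time), flags a node as semi-failure when its capacity falls below a hypothetical fraction of the median, and splits the remaining tasks in proportion to capacity so that all nodes would finish at about the same predicted end time. The function names, the `threshold` parameter, and the median-based rule are illustrative and are not taken from the thesis.

```python
from statistics import median


def find_semi_failure_nodes(capacity, threshold=0.5):
    """Return nodes whose capacity is below threshold * median capacity.

    The median-based rule and the default threshold are assumptions made
    for illustration, not the scoring method described in the thesis.
    """
    mid = median(capacity.values())
    return sorted(n for n, c in capacity.items() if c < threshold * mid)


def assign_remaining_tasks(capacity, remaining):
    """Split `remaining` tasks across nodes in proportion to capacity.

    Returns (plan, ideal_end), where ideal_end is the predicted time at
    which every node finishes if the work is divided perfectly.
    """
    total = sum(capacity.values())
    if total <= 0:
        raise ValueError("total capacity must be positive")
    ideal_end = remaining / total
    # Fractional share of tasks each node could finish by ideal_end.
    shares = {n: c * ideal_end for n, c in capacity.items()}
    # Round down, then hand leftover tasks to the largest fractional parts.
    plan = {n: int(s) for n, s in shares.items()}
    leftover = remaining - sum(plan.values())
    by_fraction = sorted(shares, key=lambda n: shares[n] - plan[n], reverse=True)
    for n in by_fraction[:leftover]:
        plan[n] += 1
    return plan, ideal_end
```

For example, with capacities `{"a": 4.0, "b": 4.0, "c": 1.0, "d": 3.0}` and 24 remaining tasks, node `c` is flagged as the slow node and still receives a small share of work rather than becoming the straggler that holds up the whole job.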
Keywords/Search Tags: MapReduce, semi-failure nodes, long tail problem, task scheduling