Research On Resource Allocation And Scheduling In Hadoop YARN

Posted on: 2016-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y Z Li
GTID: 2308330479476622
Subject: Computer Science and Technology
Abstract/Summary:
Hadoop is a framework for distributed storage and parallel computing. Because of its high reliability, scalability, and fault tolerance, it is widely used in cloud computing. Resource allocation and scheduling is a central problem in parallel computing, and Hadoop YARN, Hadoop's resource management system, provides three built-in resource schedulers. However, as Hadoop use grows, these built-in schedulers can no longer meet users' needs. It is therefore important to study how to allocate resources reasonably in order to improve system performance and reduce costs.

This thesis studies in depth the resource allocation and scheduling mechanisms of Hadoop YARN. For job scheduling it examines the resource scheduling mechanism, and for task scheduling it examines the speculative execution mechanism. It proposes solutions to two existing problems: unreasonable resource allocation and inaccurate calculation of task execution time. The details are as follows:

1) Rational resource allocation. The thesis proposes a self-adaptive resource scheduling algorithm based on ant colony optimization and particle swarm optimization. Attribute information, including load, memory, and CPU speed, is obtained through the heartbeat message mechanism and used to initialize the pheromone matrix. The self-cognition and social-cognition abilities of the particle swarm algorithm are introduced into the ant colony algorithm, and the pheromone evaporation rate is adjusted dynamically according to the fluctuation trend of the global optimal solution in the ant colony algorithm. A resource allocation and scheduling algorithm is designed, and a new scheduler is implemented to verify its effectiveness. Experimental results show that this method not only allocates Hadoop resources efficiently but also shortens job execution time.

2) Accurate prediction of task execution time. The thesis proposes a speculative execution mechanism based on the C4.5 decision tree algorithm. It first analyzes why the execution times of the speculative task and the backup task must be calculated accurately, then describes the existing problems in the traditional Hadoop scheduling algorithm. On this basis, it proposes a speculative execution mechanism based on the C4.5 decision tree algorithm and designs a speculative execution algorithm to calculate task execution time. The algorithm compares the execution time of the speculative task with that of the backup task to judge whether the backup task needs to be started, which avoids launching backup tasks unnecessarily and shortens job execution time. The Hadoop source code is modified to implement the algorithm. Experimental results show that, in terms of both job execution time and run stability, the proposed algorithm clearly outperforms the other speculative execution algorithms.
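
The comparison step described in item 2 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the C4.5 model or its features, so the two predicted times are simply passed in as numbers, and the names TaskEstimate, should_launch_backup, pick_backup_candidate, and the min_gain threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class TaskEstimate:
    """Predicted times for one running task, in seconds (hypothetical structure)."""
    task_id: str
    remaining_time: float  # predicted time for the current attempt to finish
    backup_time: float     # predicted time for a fresh backup attempt to finish


def should_launch_backup(est: TaskEstimate, min_gain: float = 0.0) -> bool:
    """Return True only if a backup is predicted to finish sooner.

    min_gain is an assumed safety margin: the backup must save more than
    this many seconds to be considered worthwhile.
    """
    return est.remaining_time - est.backup_time > min_gain


def pick_backup_candidate(
    estimates: Iterable[TaskEstimate], min_gain: float = 0.0
) -> Optional[TaskEstimate]:
    """Pick the task whose backup is predicted to save the most time, if any."""
    worthwhile = [e for e in estimates if should_launch_backup(e, min_gain)]
    if not worthwhile:
        return None  # no backup is predicted to help, so none is started
    return max(worthwhile, key=lambda e: e.remaining_time - e.backup_time)
```

In this sketch, a task predicted to need 120 s more while a backup would need 40 s is selected, whereas a task predicted to need 30 s with a 45 s backup is left alone. The quality of the decision depends entirely on how accurate the two predictions are, which is the part the thesis addresses with its C4.5-based mechanism.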
Keywords/Search Tags: Ant Colony Optimization, Resource Scheduling, Speculative Execution, Hadoop