Research And Improvement Of Task Scheduling Algorithm In Hadoop

Posted on: 2013-12-05  Degree: Master  Type: Thesis
Country: China  Candidate: B L Ma  Full Text: PDF
GTID: 2268330425997365  Subject: Computer system architecture
Abstract/Summary:
Cloud computing is an emerging parallel computing technology that has developed rapidly in both academia and industry, and a large number of cloud computing systems have been put into service. In heterogeneous environments, the traditional Hadoop scheduler is inefficient: it lengthens response time and wastes system resources. Identifying the defects of the traditional Hadoop scheduling algorithms in order to improve performance and resource utilization is therefore of practical significance.

This thesis first introduces the background, development, and current state of cloud computing, along with the key technologies and framework of the Hadoop platform. It then examines Hadoop's job scheduling in detail. Based on an analysis of the ideas, design, and defects of four existing schedulers (the FIFO scheduler, the Fair scheduler, the Capacity scheduler, and the Speculative task scheduler), we propose an improved Speculative task scheduler. In the improved algorithm, the scheduler derives the proportion of time spent in each stage of Map and Reduce tasks from recorded information and distinguishes between nodes in finer detail, so that backup copies of slow tasks are launched on fast nodes with data locality taken into account. This reduces response time and increases resource utilization.

Finally, we built a lab environment to test an implementation of the improved algorithm and compared it with the four existing algorithms to evaluate its performance. The results show that the algorithm achieves its goal: it makes better use of system resources, schedules tasks effectively, and performs well in heterogeneous environments.
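
The abstract does not give the algorithm's formulas, so the following is only a minimal Python sketch of the idea it describes: derive per-stage time proportions from recorded runs, use them to weight task progress, pick the slowest task, and place its backup on a fast node, preferring one that holds the task's data. All names (stage_weights, Task, pick_backup, slow_factor), the remaining-time estimate elapsed * (1 - progress) / progress, and the "faster than average" definition of a fast node are assumptions made for illustration, not the thesis's actual method or any Hadoop API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple


def stage_weights(history: Sequence[Dict[str, float]]) -> Dict[str, float]:
    """Average share of task time spent in each stage, from recorded runs.

    Assumes every record lists the same stages with their durations in seconds.
    """
    totals: Dict[str, float] = {}
    for record in history:
        run_time = sum(record.values())
        for stage, seconds in record.items():
            totals[stage] = totals.get(stage, 0.0) + seconds / run_time
    return {stage: share / len(history) for stage, share in totals.items()}


@dataclass
class Task:
    task_id: str
    stages_done: List[str]    # stages already finished
    current_stage: str
    stage_fraction: float     # progress inside the current stage, 0..1
    elapsed: float            # seconds since the task started
    replica_nodes: List[str]  # nodes holding this task's input data


def progress(task: Task, weights: Dict[str, float]) -> float:
    """Overall progress in 0..1, weighting stages by their recorded time share."""
    done = sum(weights[s] for s in task.stages_done)
    return done + weights[task.current_stage] * task.stage_fraction


def remaining_time(task: Task, weights: Dict[str, float]) -> float:
    """Assumed estimate: the task keeps running at its observed average rate."""
    p = progress(task, weights)
    return task.elapsed * (1.0 - p) / p


def pick_backup(
    tasks: Sequence[Task],
    weights: Dict[str, float],
    node_speed: Dict[str, float],
    free_nodes: Sequence[str],
    slow_factor: float = 1.5,  # hypothetical threshold for "slow"
) -> Optional[Tuple[str, str]]:
    """Return (task_id, node) for one backup task, or None if none is warranted."""
    # Tasks with no measurable progress yet are skipped: nothing to estimate from.
    estimates = [(remaining_time(t, weights), t) for t in tasks
                 if progress(t, weights) > 0]
    if not estimates or not free_nodes:
        return None
    mean = sum(r for r, _ in estimates) / len(estimates)
    slowest_time, slowest = max(estimates, key=lambda e: e[0])
    if slowest_time <= slow_factor * mean:
        return None  # no task is slow enough to justify a backup
    # A node counts as "fast" here if it is at least as fast as the average node.
    avg_speed = sum(node_speed.values()) / len(node_speed)
    fast = [n for n in free_nodes if node_speed[n] >= avg_speed]
    if not fast:
        return None
    # Data locality: prefer fast nodes that already hold the task's input.
    local = [n for n in fast if n in slowest.replica_nodes]
    node = max(local or fast, key=lambda n: node_speed[n])
    return slowest.task_id, node
```

In this sketch a backup is only launched when the slowest task's estimated remaining time clearly exceeds the average, which keeps speculative copies from consuming slots on tasks that are merely a little behind. A real scheduler would also have to cap the number of concurrent backups and update node speeds as tasks complete.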
Keywords/Search Tags: Hadoop, Job Scheduling, Heterogeneous environments, Locality of data