Research On Task Scheduling Algorithms Based On Pre-Release Resource List In Hadoop

Posted on: 2017-12-21 | Degree: Master | Type: Thesis
Country: China | Candidate: J Chen | Full Text: PDF
GTID: 2428330488971860 | Subject: Computer Science and Technology
Abstract/Summary:
The development of Internet technology has ushered in the Big Data era, and Big Data has become one of the most important research topics. With massive data volumes, a single computer can no longer meet storage and computing requirements, so a variety of big data computing models and distributed computing systems have emerged. MapReduce is one of the most classic and popular of these models. Task scheduling and resource allocation have always been key technologies for large-scale distributed clusters and are especially important for improving their efficiency. In Hadoop, tasks are assigned when each slave sends a heartbeat request to the master. Existing task schedulers choose an assignment based only on the current task request from the slave; they do not relate a wider set of resources to the actual needs of jobs, which would allow a better scheduling scheme. This thesis therefore analyzes the MapReduce speculative execution and task scheduling algorithms and proposes improved algorithms, as follows:
(1) A pre-release resource list is proposed. The task processing rate of each node and the remaining time of each task are estimated from job history information and the current status of the Hadoop cluster. A pre-release resource list is then built from these estimates, giving Hadoop task scheduling more room for optimization.
(2) A speculative execution algorithm based on the pre-release resource list is designed. A faster resource is selected from the list so that a slow task completes sooner. Experiments show that the algorithm effectively shortens the completion time of jobs that contain slow tasks.
(3) A task scheduling algorithm based on the pre-release resource list is designed. A pre-release resource list is built for each job from the job's task processing rate and the data locality on each node. Jobs are then matched against their pre-release resource lists to find the most suitable resource for each job. Experiments show that the algorithm effectively improves cluster performance.
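The abstract does not give the estimation formulas, so the following is only a minimal sketch of the idea, not the thesis's algorithm. It assumes a simple progress-rate estimate (remaining time = (1 - progress) / (progress / elapsed)), one slot per running task, and a per-node rate supplied by the caller in place of the job history the thesis uses. All names (`RunningTask`, `PreReleaseEntry`, `build_pre_release_list`, `pick_backup_resource`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningTask:
    task_id: str
    node: str
    progress: float  # fraction complete, in [0, 1]
    elapsed: float   # seconds since the task started


@dataclass
class PreReleaseEntry:
    node: str
    release_in: float  # estimated seconds until this slot is freed
    node_rate: float   # assumed fraction of a task the node completes per second


def remaining_time(task):
    """Estimate remaining seconds from the observed progress rate (an assumption)."""
    if task.progress <= 0 or task.elapsed <= 0:
        return float("inf")
    rate = task.progress / task.elapsed
    return (1.0 - task.progress) / rate


def build_pre_release_list(tasks, node_rates):
    """List the slots that will be released, soonest first."""
    entries = [
        PreReleaseEntry(t.node, remaining_time(t), node_rates[t.node])
        for t in tasks
    ]
    return sorted(entries, key=lambda e: e.release_in)


def pick_backup_resource(slow_task, pre_release):
    """Return the entry where a full re-run would finish earliest,
    or None if no entry beats letting the task finish where it is."""
    best = None
    best_finish = remaining_time(slow_task)
    for entry in pre_release:
        if entry.node == slow_task.node or entry.node_rate <= 0:
            continue
        # Wait for the slot, then run the whole task at that node's rate.
        finish = entry.release_in + 1.0 / entry.node_rate
        if finish < best_finish:
            best, best_finish = entry, finish
    return best


tasks = [
    RunningTask("t1", "n1", progress=0.9, elapsed=90.0),
    RunningTask("t2", "n2", progress=0.5, elapsed=20.0),
    RunningTask("t3", "n3", progress=0.2, elapsed=100.0),  # the slow task
]
node_rates = {"n1": 0.01, "n2": 0.025, "n3": 0.002}

pre_release = build_pre_release_list(tasks, node_rates)
choice = pick_backup_resource(tasks[2], pre_release)
print(choice.node)  # n2
```

In this toy cluster the slot on `n1` is released first, but `n2` is chosen because waiting slightly longer for a faster node gives an earlier estimated finish. That is the kind of choice a scheduler reacting only to the current heartbeat cannot make.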
Keywords/Search Tags:Task Scheduling, Speculative Execution, Big Data, MapReduce, Hadoop