Research And Implementation Of Resource-adaptive Scheduling Strategy For Hadoop Clusters

Posted on: 2017-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Yang
GTID: 2348330503989804
Subject: Computer Science and Technology
Abstract/Summary:
With the development and in-depth application of big data technology, the Hadoop YARN (Yet Another Resource Negotiator) scheduler is no longer an effective solution in heterogeneous cluster environments. On the one hand, YARN's resource scheduling system cannot dynamically adjust the number of tasks according to each node's computing power in a heterogeneous cluster, which wastes the resources of the more capable nodes and leaves the system's overall performance poorly exploited. On the other hand, YARN's resource scheduling strategy always allocates standardized resources to tasks without considering the different resource requirements of jobs, which causes a large amount of resource fragmentation and degrades the resource utilization of the Hadoop system. To address these problems, this thesis proposes a resource-adaptive scheduling strategy based on a study of YARN's scheduling mechanism. First, the cluster's monitoring server monitors the performance of all nodes and jobs. Second, the system evaluates the computing power of each node using the real-time monitoring data. Finally, the cluster's master node decides whether to start the dynamic resource scheduling strategy based on a similarity assessment that takes the performance monitoring information of nodes and jobs into account. The optimized system can distinguish the heterogeneity of different nodes, dynamically allocate resources according to tasks' real-time needs, and refine YARN's scheduling semantics; it can also be used as a secondary resource scheduling strategy for the upper-level scheduler. The resource-adaptive scheduling strategy is implemented with Hadoop 2.0 and the Ganglia monitoring system, and performance tests are run on typical CPU-intensive and I/O-intensive jobs. Experimental results show that the system can shorten job completion time, increase cluster concurrency, and effectively improve Hadoop's resource utilization.
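The three steps in the abstract (monitor, evaluate computing power, decide via a similarity assessment) can be illustrated with a rough sketch. The abstract does not give the thesis's actual formulas, so everything here is an assumption made for illustration: the `NodeSample` fields, the weighted `power_score`, the min/max ratio used as the similarity measure, the 0.8 threshold, and the proportional task split are all hypothetical and are not the author's method or any YARN or Ganglia API.

```python
from dataclasses import dataclass


@dataclass
class NodeSample:
    """One monitoring snapshot for a node (hypothetical fields)."""
    name: str
    cpu_cores: int
    cpu_idle: float     # fraction of CPU time idle, 0.0 to 1.0
    mem_free_gb: float


def power_score(s, w_cpu=0.5, w_mem=0.5, ref_cores=8.0, ref_mem_gb=16.0):
    """Assumed scoring rule: weighted sum of idle cores and free memory,
    each normalized against an arbitrary reference machine."""
    idle_cores = s.cpu_cores * s.cpu_idle
    return w_cpu * idle_cores / ref_cores + w_mem * s.mem_free_gb / ref_mem_gb


def node_similarity(scores):
    """Assumed similarity measure: weakest score over strongest score.
    1.0 means the nodes look identical; values near 0 mean strong heterogeneity."""
    strongest = max(scores)
    return min(scores) / strongest if strongest > 0 else 1.0


def plan_tasks(samples, total_tasks, threshold=0.8):
    """Return {node name: task count}.

    If the nodes are similar enough, split tasks evenly (standing in for the
    default behavior); otherwise split them in proportion to power_score.
    """
    if not samples:
        return {}
    scores = [power_score(s) for s in samples]
    if node_similarity(scores) >= threshold:
        weights = [1.0] * len(samples)   # adaptive strategy stays off
    else:
        weights = scores                 # adaptive strategy switched on
    total_weight = sum(weights)
    exact = [total_tasks * w / total_weight for w in weights]
    counts = [int(e) for e in exact]
    # Hand out tasks lost to rounding, largest fractional remainder first.
    leftover = total_tasks - sum(counts)
    by_remainder = sorted(range(len(samples)),
                          key=lambda i: exact[i] - counts[i], reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return {s.name: c for s, c in zip(samples, counts)}


if __name__ == "__main__":
    cluster = [
        NodeSample("fast", cpu_cores=8, cpu_idle=0.75, mem_free_gb=24.0),
        NodeSample("slow", cpu_cores=4, cpu_idle=0.5, mem_free_gb=8.0),
    ]
    print(plan_tasks(cluster, total_tasks=12))
```

With these made-up samples the two scores differ by a factor of three, so the sketch takes the adaptive branch and gives the stronger node three times as many tasks. A real implementation would read the samples from the monitoring system and would also need to weigh each job's measured resource demand, which this sketch omits.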
Keywords/Search Tags: heterogeneous cluster, computing power, resource-adaptive scheduling