The Research Of Job Scheduling Algorithms Based On Resource-aware For Hadoop Platform

Posted on: 2015-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Hu
GTID: 2268330431469151
Subject: Computer system architecture
Abstract/Summary:
Cloud computing is an emerging business and computing model that commercializes computing capacity, storage resources, and software services, giving users a more reliable, cheaper, and faster way to obtain them. Ultimately, it separates resource owners from resource users. Big data refers to data sets that common software tools cannot capture, manage, and process within a tolerable time; its characteristics include volume, variety, velocity, and so on. Cloud computing platforms provide the hardware foundation for big data analysis, and big data analysis in turn supplies the application demand that drives the development of cloud computing. The concept of analysis as a service is a product of the development and integration of cloud computing and big data analysis.
Under normal circumstances, however, a cloud computing platform is a heterogeneous cluster composed of a large number of nodes with differing performance. As a result, cloud computing environments often suffer from load imbalance, which seriously degrades the overall performance of the system. Research on task scheduling techniques for heterogeneous cloud platforms is therefore particularly important.
This paper first elaborates the concepts and characteristics of cloud computing and big data. It then studies in depth the MapReduce programming model and the Hadoop Distributed File System, which are key technologies of cloud computing, and analyzes and compares the currently popular parallel programming models. Furthermore, building on an analysis of the design and implementation of three existing schedulers (the FIFO scheduler, the Fair scheduler, and the Capacity scheduler), we design a new job scheduling algorithm. We name it the Scheduler Algorithm Based on Resource-aware, referred to as the Resource-aware scheduler and abbreviated as RAS.
The Resource-aware scheduler takes into account the heterogeneity of computing nodes, the varying amounts of data in jobs, and the differences among tasks. We introduce the concept of reasonable degree to characterize a task allocation scheme and use it to weigh the relationship among the computing nodes, the amount of job data, and the diversity of tasks. The overall performance of the cloud computing platform is maximized when both operational efficiency and a reasonable distribution of resources are achieved; therefore, the allocation scheme with the maximum reasonable degree is taken as the best task allocation strategy. Ultimately, different computing nodes are utilized differently, different types of jobs receive different service, and different jobs are executed differently, thereby improving the overall response time and utilization of the system.
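
The idea of selecting the allocation with the maximum reasonable degree can be illustrated with a small sketch. The abstract does not give the formula for reasonable degree, so the score below is purely an assumption made for illustration: it rewards allocations in which the most heavily loaded node finishes sooner, given hypothetical task data sizes and relative node speeds. The names `reasonable_degree`, `best_allocation`, `task_sizes`, and `node_speeds` are invented here and are not taken from the thesis or from Hadoop.

```python
from itertools import product


def reasonable_degree(assignment, task_sizes, node_speeds):
    """Hypothetical score in (0, 1]; NOT the thesis's definition.

    assignment[i] is the index of the node that runs task i. The score is
    higher when the slowest-finishing node completes its work sooner.
    """
    finish_time = [0.0] * len(node_speeds)
    for task, node in enumerate(assignment):
        # Assumed cost model: run time = data size / relative node speed.
        finish_time[node] += task_sizes[task] / node_speeds[node]
    return 1.0 / (1.0 + max(finish_time))


def best_allocation(task_sizes, node_speeds):
    """Try every task-to-node assignment and keep the highest-scoring one.

    Exhaustive search is only practical for toy inputs; it is used here
    to make the "maximum reasonable degree" selection explicit.
    """
    candidates = product(range(len(node_speeds)), repeat=len(task_sizes))
    return max(
        candidates,
        key=lambda a: reasonable_degree(a, task_sizes, node_speeds),
    )


# Toy heterogeneous cluster: node 0 is twice as fast as node 1.
task_sizes = [4.0, 3.0, 1.0]
node_speeds = [2.0, 1.0]

plan = best_allocation(task_sizes, node_speeds)
score = reasonable_degree(plan, task_sizes, node_speeds)
print("task -> node:", plan, "score:", score)
```

In this toy setting the faster node receives more of the data, which matches the abstract's point that different nodes are utilized differently. A real scheduler would need a heuristic rather than exhaustive search, and a score that also reflects job type and resource usage.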
Keywords/Search Tags:Cloud computing, Big data, Heterogeneous cluster, Scheduling algorithm, Resource-aware