Resource Management System For Big Data Cloud Platform

Posted on: 2019-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
Full Text: PDF
GTID: 2428330563986013
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of network technologies such as the Internet, the mobile Internet, sensor technology and the Internet of Things, huge amounts of data are being produced continuously, and the world has entered the big data era. Big data technology effectively addresses the inability of traditional stand-alone systems to store huge amounts of data or to provide very high computing capacity. A big data cloud platform is characterized by large scale, a wide range of users and many applications. Deploying many big data computing frameworks on such a platform both wastes resources and complicates operation and maintenance. How to schedule and manage the various tasks, and how to allocate the cloud platform's resources reasonably, is therefore of great significance for keeping the system running smoothly and improving quality of service. Based on Hadoop YARN and Kubernetes, this thesis designs and implements a unified resource management system that supports a variety of big data computing frameworks, improving cluster resource utilization and reducing the complexity of operation and maintenance. The main work and results are as follows:

1. The current state of big data platforms and cloud computing is surveyed, the significance of developing a unified resource management platform is discussed, and the existing problems of the resource scheduling module are analyzed.

2. Based on an in-depth study of the system architecture of Hadoop YARN and its resource scheduling module, the thesis analyzes the resource waste caused by the reservation strategy of YARN's built-in resource schedulers and proposes a resource allocation algorithm based on reservation backfill. Experimental data show that the reservation backfill algorithm improves cluster resource utilization when the workload contains many large tasks.

3. Based on a study of the architecture of the Kubernetes container scheduling system and the workflow of its resource scheduler, the thesis adds scheduling algorithms based on hostname matching and dynamic resources, to address the problem that the built-in algorithm library covers few scenarios. Experiments show that these improvements allow the Kubernetes cluster scheduler to handle more scenarios.

4. Based on container technology, a resource management system for a big data cloud platform is designed and implemented. ZooKeeper and etcd are used to remove the scheduler's single point of failure, and the system integrates with a variety of computing frameworks such as MapReduce, Spark and Flink to support a cloud service platform for big data storage, mining and analysis.
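The abstract does not describe the reservation backfill algorithm in detail, so the following is only a minimal sketch of the general idea, not the thesis's implementation. It assumes a single node with one resource dimension and a FIFO queue; the names `Task`, `schedule_with_reservation` and `schedule_with_backfill` are hypothetical. Under a plain reservation strategy, a large task that does not fit blocks everything behind it, while backfill lets smaller queued tasks use the space that would otherwise sit idle.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    name: str
    demand: int  # hypothetical unit, e.g. GB of memory


def schedule_with_reservation(free: int, queue: List[Task]) -> List[str]:
    """Strict FIFO: the first task that does not fit holds a reservation
    and blocks every task queued behind it."""
    placed = []
    for task in queue:
        if task.demand > free:
            break  # reserved for this task; remaining capacity stays idle
        free -= task.demand
        placed.append(task.name)
    return placed


def schedule_with_backfill(free: int, queue: List[Task]) -> List[str]:
    """Same queue, but tasks behind a blocked task may fill the idle space."""
    placed = []
    for task in queue:
        if task.demand > free:
            continue  # blocked task keeps waiting; keep scanning the queue
        free -= task.demand
        placed.append(task.name)
    return placed


queue = [Task("A", 4), Task("B", 6), Task("C", 2), Task("D", 2)]
print(schedule_with_reservation(8, queue))  # ['A']
print(schedule_with_backfill(8, queue))     # ['A', 'C', 'D']
```

In this toy run the reservation strategy uses 4 of 8 units, while backfill uses all 8. A production backfill scheduler would typically also check, for example through runtime estimates, that backfilled tasks do not delay the task holding the reservation; that check is omitted here.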
Keywords/Search Tags: Big Data, Resource Scheduler, Hadoop, Cloud Computing, Docker