
Research On Resource Elastic Scheduling And Collaborative Allocation Algorithm For Distributed Streaming Data Processing System

Posted on: 2021-04-28  Degree: Master  Type: Thesis
Country: China  Candidate: Z L Huang  Full Text: PDF
GTID: 2428330611967602  Subject: Software engineering
Abstract/Summary:
With the advent of the big data era, growing data volumes and computational workloads have made data processing considerably more difficult. The value of big data depends not only on its volume but also on its timeliness: real-time, accurate data often provides the most effective support for scenario prediction. Real-time data stream processing and distributed computing are therefore receiving increasing attention.

Distributed stream data processing currently faces three main problems in task operation and cluster management: 1) Stream data processing tasks and the system layer are unaware of each other, so the cluster cannot dynamically adjust system resources according to the task-layer load. 2) The resources of the cluster system layer must be adjusted manually and cannot be scaled automatically to reduce resource consumption. 3) The built-in scheduling algorithms of current general-purpose stream data processing frameworks cannot dynamically modify the scheduling scheme in response to task load.

To address both the real-time requirements of streaming data processing and the efficiency of cluster resource utilization, this paper proposes a resource scheduling model in which a middleware scheduler coordinates communication between task-level resources and system-level resources. The main work is:

(1) A three-layer resource scheduling model is proposed, which connects the task-layer resource scheduler and the system-layer resource scheduler through a middleware scheduler. The task-layer resource scheduler formulates a task resource allocation strategy by monitoring the real-time delay of streaming data processing tasks and applying a dynamic resource scheduling model based on queuing theory. The system-layer resource scheduler receives resource adjustment messages from the middleware scheduler and dynamically adds or removes resources according to the current usage of resource nodes in the distributed system, thereby meeting the resource adjustment requirements of the task layer. Controlling system-layer resources in this way also reduces wasted cluster resources.

(2) Building on the resource scheduling model, this paper further proposes a collaborative resource allocation algorithm for multi-task scenarios. The algorithm uses task management weights preset by the system. It simulates a range of allocation strategies, compares the resulting system delays, and selects the allocation plan that best serves the current system objective. The selected plan is then applied through the resource scheduling model, so that the system delay reaches the best value achievable under the given task weights.

This paper implements the resource scheduling model and the multi-task collaborative resource allocation algorithm on Storm on YARN. Experiments show that the resource scheduling model can adjust the allocation of system resources dynamically and accurately, and that resources are allocated reasonably across tasks. Compared with the original Storm on YARN resource management framework, the three-tier resource scheduling model achieves better dynamic resource adjustment for streaming data processing tasks.
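The abstract does not give the queuing model or the search procedure, so the following is only an illustrative sketch of how a queuing-theory delay estimate and a weight-based comparison of allocation plans could fit together. It assumes each task behaves like an M/M/c queue (Erlang C formula), that the objective is the weighted sum of expected per-task delays, and that plans are found by exhaustive enumeration. The function names `mmc_sojourn_time` and `best_allocation` and all parameters are hypothetical and are not taken from the thesis.

```python
import math
from itertools import product


def mmc_sojourn_time(arrival_rate, service_rate, servers):
    """Expected time in system for an M/M/c queue (assumed model).

    Returns math.inf when the queue is unstable (utilization >= 1).
    """
    if servers <= 0:
        return math.inf
    offered = arrival_rate / service_rate      # offered load a = lambda / mu
    rho = offered / servers                    # per-server utilization
    if rho >= 1.0:
        return math.inf
    # Erlang C: probability that an arriving tuple has to wait.
    head = sum(offered ** k / math.factorial(k) for k in range(servers))
    tail = offered ** servers / (math.factorial(servers) * (1.0 - rho))
    p_wait = tail / (head + tail)
    waiting = p_wait / (servers * service_rate - arrival_rate)
    return waiting + 1.0 / service_rate        # queueing delay + service time


def best_allocation(tasks, total_executors):
    """Pick the executor split with the lowest weighted expected delay.

    tasks: list of (weight, arrival_rate, service_rate), weights > 0.
    Every task gets at least one executor; the total may not exceed
    total_executors. Exhaustive search, so only suitable for small inputs.
    """
    best_plan, best_cost = None, math.inf
    for plan in product(range(1, total_executors + 1), repeat=len(tasks)):
        if sum(plan) > total_executors:
            continue
        cost = sum(
            weight * mmc_sojourn_time(lam, mu, c)
            for (weight, lam, mu), c in zip(tasks, plan)
        )
        if cost < best_cost:
            best_plan, best_cost = plan, cost
    return best_plan, best_cost


if __name__ == "__main__":
    # Two hypothetical tasks: (weight, tuples/s arriving, tuples/s per executor).
    demo_tasks = [(2.0, 8.0, 5.0), (1.0, 3.0, 5.0)]
    print(best_allocation(demo_tasks, total_executors=4))
```

In this sketch the higher-weight, more heavily loaded task tends to receive the spare executor, which mirrors the idea of comparing simulated delays under several allocation strategies and choosing the one that best fits the task weights. A real scheduler would replace the analytic estimate with measured delays and would need a cheaper search than full enumeration.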
Keywords/Search Tags:distributed cluster, dynamic scheduling, collaborative allocation, stream data processing, resource allocation