Research On Elastic Resource Scheduling In Large-scale Stream Data Processing

Posted on: 2020-07-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L N Li
GTID: 1368330575478767
Subject: Computer system architecture
Abstract/Summary:
Large-scale stream processing systems are an important tool for processing large-scale stream data. They promote the development and application of big data stream computing technology and help address the challenges of the big data era. Cloud-based elastic resource scheduling is an important part of such systems: it affects system performance and constrains the applications that run on the system. Understanding elastic resource scheduling is therefore of great significance for understanding stream processing systems in essence, improving their resource utilization, and reducing their energy consumption.

This thesis focuses on elastic resource scheduling in large-scale stream data processing, which dynamically expands or shrinks resources according to changes in the input load of the application. The key to resource scheduling is determining when to adjust resources and by how much, so that the resources provided match the load change in real time. Resource scheduling involves both resource allocation and resource placement. Resource allocation determines the timing of virtual machine adjustments and the number of virtual machines to add or remove. Resource placement maps the virtual machines that are allocated or released to physical machines. Specifically, we combine a collaborative load forecasting model, a feedback collaboration mechanism, and related techniques with elastic resource scheduling. From a system-level perspective, we consider either resource allocation alone or resource allocation and resource placement jointly. For different application performance objectives, the research addresses the following three issues:

· Elastic resource scheduling for burstiness in large-scale streaming data. Based on the correlation between the input loads of upstream and downstream operations in the application, a data load forecasting model is constructed and a cooperation mechanism between operations is designed. An effective resource allocation method is also designed to meet the data quality requirement of the application with the lowest system overhead. The load forecasting model can be used by other proactive scheduling strategies and provides a reference for data prediction in similar applications and related work. The scheduling strategy also provides guidance on effective control measures when unexpected situations occur in different fields.

· Collaborative proactive elastic resource scheduling in large-scale stream data processing. For computation-intensive and communication-intensive applications, a latency model and a communication cost model are constructed that take bandwidth into account. Combined with an improved load prediction model, a scheduling strategy is designed from a system-level perspective with the goal of minimizing energy consumption. It considers resource allocation and resource placement in a unified way and meets the latency requirement. By adjusting parameters, the research can be applied to a range of problems with different performance and cost requirements, and it provides reference and guidance on solution ideas and control strategies for similar problems in different fields.

· Low-reconfiguration-cost proactive elastic resource scheduling in large-scale stream data processing. For resource configuration with non-strict data synchronization, scheduling strategies built around resource reservation are designed from the perspectives of optimization and intelligent methods. These strategies meet the latency requirement of the application probabilistically, while ensuring the effectiveness of resources, the stability of the system, and a reduced system reconfiguration overhead. The research provides new perspectives and ideas for solving similar problems, and it introduces an intelligent method combined with load forecasting into elastic resource scheduling.

Specifically, we have made the following contributions on the above research issues:

1. Burstiness-aware elastic resource allocation in large-scale stream data processing. Based on the pipelined data processing mode and the time window mechanism, a data load correlation model and a bidirectional feedback control mechanism are designed. A load-aware optimal elastic resource allocation method is proposed, which pre-allocates resources that match the burst load to ensure the data quality of the application, high resource utilization, and low reconfiguration cost. The sensitivity of our proactive elastic strategy is evaluated against reactive and other proactive strategies under different workloads, and the feasibility of responding to sudden load is confirmed.

2. Proactive elastic collaborative resource scheduling in large-scale stream data processing. From the perspective of joint optimization, and with the aim of minimizing energy consumption, a collaborative proactive elastic resource scheduling strategy is proposed that satisfies the latency requirement of the application. It is based on system models such as the latency model and the energy consumption model. The strategy consists of an optimal resource pre-allocation method for the pipeline processing mode and a communication-aware heuristic resource placement method. The two methods are executed alternately to obtain an approximately optimal resource scheduling scheme. Comparative tests evaluate the two methods both individually and in combination, and verify the good performance of the proposed algorithm and strategy.

3. Low-reconfiguration-cost proactive elastic resource allocation in large-scale stream data processing. With the goal of minimizing the reconfiguration cost and maximizing resource utilization, and combined with the load forecasting model, two proactive elastic resource allocation strategies are proposed that probabilistically satisfy the latency requirement of the application. The first is a heuristic resource allocation strategy based on the local threshold method and the idea of resource reservation. The second, drawing on strategies from non-streaming elastic systems, is an adaptive semi-model resource allocation strategy that adopts reinforcement learning together with the adjustment rules of the heuristic algorithm. The advantages of the heuristic and intelligent methods are verified by comparing the performance and cost of different strategies.
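
The general idea of proactive allocation with resource reservation can be illustrated with a minimal sketch. This is not the thesis's algorithm: the moving-average forecast, the function names, and the parameters `capacity_per_vm`, `reserve_ratio`, and `shrink_threshold` are all hypothetical stand-ins chosen for illustration. The sketch scales out as soon as the forecast load plus a reserved headroom exceeds current capacity, and scales in only when predicted utilization falls below a threshold, which limits how often the system is reconfigured.

```python
import math


def forecast_load(history, window=3):
    """Naive moving-average forecast (an assumed stand-in for a real load model)."""
    recent = list(history)[-window:]
    return sum(recent) / len(recent)


def target_vms(predicted_load, capacity_per_vm, reserve_ratio=0.2):
    """VMs needed for the predicted load plus a reserved headroom (hypothetical rule)."""
    demand = predicted_load * (1 + reserve_ratio)
    return max(1, math.ceil(demand / capacity_per_vm))


def next_allocation(current_vms, history, capacity_per_vm,
                    reserve_ratio=0.2, shrink_threshold=0.5):
    """Decide the VM count for the next time window.

    Scale out immediately when the forecast demands it; scale in only when
    predicted utilization drops below shrink_threshold, to avoid frequent
    reconfiguration.
    """
    predicted = forecast_load(history)
    needed = target_vms(predicted, capacity_per_vm, reserve_ratio)
    if needed > current_vms:
        return needed  # pre-allocate ahead of the predicted load
    utilization = predicted / (current_vms * capacity_per_vm)
    if utilization < shrink_threshold:
        return needed  # release resources that are clearly idle
    return current_vms  # keep the current configuration


if __name__ == "__main__":
    loads = [100, 200, 300]  # tuples per second in recent windows (made-up data)
    print(next_allocation(current_vms=2, history=loads, capacity_per_vm=100))
```

In this sketch the gap between the scale-out rule and the scale-in threshold acts as a simple hysteresis band; a learning-based strategy would replace these fixed rules with adjustments learned from observed latency and cost.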
Keywords/Search Tags: stream data processing, load forecasting, elastic resource allocation, proactive resource allocation, collaborative resource scheduling, intelligent resource scheduling