Research On Dynamic Resource Scheduling On Streaming Big Data Processing Platform

Posted on: 2017-04-28; Degree: Master; Type: Thesis
Country: China; Candidate: M L Fan; Full Text: PDF
GTID: 2348330503492886; Subject: Computer Science and Technology
Abstract/Summary:
Data stream processing is an important form of computation in big data processing. On a data stream processing platform, data flows continuously and in real time into the corresponding processing components, and the results are returned to users after processing. In current data stream platforms, the default scheduling strategy allocates resources statically, so when the computing load fluctuates, the resources allocated to each computing container no longer match its resource demand. To address this problem, this research proposes a real-time dynamic resource scheduling technique based on the differing and fluctuating resource requirements of computing containers. First, we analyze how resource requirements change with load and make an online prediction of resource demand for the next time window. We then schedule resources (CPU and memory) to each computing container running on the platform in real time according to that prediction. In addition, when multiple applications coexist on the platform, a computing container in one application can obtain more resources when it needs them to process data, and can release resources to other applications when it no longer needs them. Throughout, improving resource utilization and minimizing scheduling overhead are treated as prerequisites.

The main contributions of this thesis are:

1) An architecture model for dynamic resource scheduling on data stream processing platforms. From the bottom up, the model is divided into a monitoring/execution level, a prediction agent level, and a scheduling decision-making level. It decouples the dynamic resource scheduling function from the execution mechanism, so that prediction, scheduling decisions, and monitoring run in parallel and cooperate to achieve dynamic resource scheduling.

2) An online resource prediction method for data stream processing platforms. The method monitors the CPU and memory usage of each computing container in the applications running on the platform, analyzes the platform's load characteristics, and uses a prediction model based on resource usage rate to make an online combined prediction of CPU and memory demand per time window. This prediction is the basis for the combined scheduling of physical resources for each computing container.

3) A combined resource scheduling method for data stream processing platforms. The method determines the resources to add to or release from each computing container based on the prediction result, and decides whether to migrate a computing container according to the scheduling result. By introducing a scheduling strategy based on a combination of multi-objective constraints, we consider three goals: the lowest scheduling cost, the highest possible resulting resource utilization rate, and the minimum number of migrated containers. The best computing container is then chosen and assigned to an appropriate node.

4) D-JStorm, a dynamic resource scheduling system that integrates the results above, designed and implemented on the open-source data stream processing platform JStorm. D-JStorm combines monitoring, prediction, and combined resource scheduling to realize resource scheduling and migration of computing containers on the platform. D-JStorm does not require changes to upper-layer application code and has good compatibility.

5) A systematic performance evaluation of D-JStorm against JStorm. The results show that response time is reduced by 23.18% on average, with the highest average reduction being 58.26%; CPU resource utilization increases by up to 17.96% and by 11.39% on average; memory resource utilization increases by up to 88.7% and by 71.16% on average.
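
To make the predict-then-schedule loop concrete, the following is a minimal Python sketch and not the thesis's actual method. The abstract does not specify the prediction model, so a plain moving average over the most recent time windows stands in for it. The names `WindowPredictor` and `resource_delta`, the window length, and the 20% headroom are all hypothetical choices made for illustration.

```python
from collections import deque


class WindowPredictor:
    """Predicts next-window usage as the mean of the last `window` samples.

    Assumption: a moving average is only a stand-in; the abstract does not
    say which usage-rate-based model the thesis uses.
    """

    def __init__(self, window=3):
        # deque with maxlen drops the oldest sample automatically
        self.samples = deque(maxlen=window)

    def observe(self, usage):
        """Record the measured usage (e.g. CPU cores or MB) for one window."""
        self.samples.append(usage)

    def predict(self):
        """Return the predicted usage for the next window (0.0 if no data)."""
        if not self.samples:
            return 0.0
        return sum(self.samples) / len(self.samples)


def resource_delta(predicted, allocated, headroom=0.2):
    """Return the amount to add (positive) or release (negative).

    The target allocation is the predicted usage plus a hypothetical
    safety headroom, expressed as a fraction of the prediction.
    """
    target = predicted * (1.0 + headroom)
    return target - allocated


if __name__ == "__main__":
    cpu = WindowPredictor(window=3)
    for sample in [1.0, 2.0, 3.0]:  # CPU cores used in recent windows
        cpu.observe(sample)
    predicted = cpu.predict()
    # A negative delta means the container can release resources.
    print(predicted, resource_delta(predicted, allocated=4.0))
```

In a real scheduler, one predictor per container and per resource (CPU, memory) would feed a placement step that also weighs migration cost. That step is omitted here because the abstract gives no details of the multi-objective formulation.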
Keywords/Search Tags: Big Data, Data stream processing platform, Dynamic resource allocation, Prediction