Research and Implementation of Load Balancing Technology for Distributed Data Stream Processing Systems

Posted on: 2013-09-23
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Huang
Full Text: PDF
GTID: 2248330374486324
Subject: Computer application technology
Abstract/Summary:
In recent years, with the development of computer network technology, a large number of data stream applications have been developed in fields such as network service security, financial data analysis, and mobile device communication. Because these applications often process large volumes of data, the system must connect different nodes into a distributed computing environment while keeping the load balanced among them. However, the bursty and uncontrollable nature of stream data in a distributed stream processing system means that traditional distributed load balancing techniques fail to meet the requirements. Load balancing for stream processing systems has therefore become one of the hot spots in current research.

This thesis addresses the load balancing problem in distributed stream processing systems, studying both its static and dynamic aspects. Static load balancing is the operator allocation problem: distributing operators as uniformly as possible across the processing nodes. Considering the differences in processing performance among nodes, we propose a static load balancing algorithm based on load weights. The algorithm first splits the global operators into operator groups and big operators, then distributes them evenly according to a relative node processing model and the load weights, and finally adjusts the distribution for further balance.

Dynamic load balancing in a distributed stream processing system, on the other hand, aims to reduce the overhead gap among nodes while fully utilizing system resources. To avoid migration lag, we propose a dynamic load balancing algorithm based on predictive analysis. By periodically collecting the status of each processing node and selecting the sequence with the smallest prediction error, the algorithm calculates the load state of each node and determines, based on the correlation coefficient, how to migrate operators between high-overhead and low-overhead nodes.

Finally, we design and implement a complete load balancing solution for the distributed stream processing system based on the static and dynamic algorithms. The cluster consists of one management node and several data processing nodes. The static load balancing algorithm performs the global operator deployment, and the dynamic load balancing algorithm then migrates operators among nodes while the system is running.

The load balancing algorithms and solution proposed in this thesis meet the requirements of a distributed stream processing system. We built a distributed data stream processing platform and ran a number of experiments on the load balancing modules. The results demonstrate the expected effectiveness and show high practical value.
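The two stages summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The static step is replaced by a simple greedy heuristic (heaviest operator first, placed on the node with the lowest load-to-capacity ratio) and does not model the split into operator groups and big operators. The dynamic step assumes that "selecting the sequence with the smallest prediction error" means back-testing a few candidate predictors on each node's load history, and that the correlation coefficient is used to pick the operator whose load is least correlated with the target node's load. All names (`assign_static`, `predict`, `plan_migration`, `threshold`) are hypothetical.

```python
from statistics import mean


def assign_static(op_loads, capacities):
    """Greedy weighted placement (assumed heuristic): take operators from
    heaviest to lightest and put each on the node whose load-to-capacity
    ratio would be lowest after receiving it."""
    placement = {node: [] for node in capacities}
    load = {node: 0.0 for node in capacities}
    for op, cost in sorted(op_loads.items(), key=lambda kv: -kv[1]):
        target = min(capacities, key=lambda n: (load[n] + cost) / capacities[n])
        placement[target].append(op)
        load[target] += cost
    return placement


def last_value(history):
    return history[-1]


def moving_average(history, window=3):
    return mean(history[-window:])


def predict(history, predictors=(last_value, moving_average)):
    """Back-test each candidate predictor on the history (needs at least two
    samples) and forecast with the one whose mean absolute error is smallest."""
    def backtest_error(p):
        return mean(abs(p(history[:i]) - history[i]) for i in range(1, len(history)))
    best = min(predictors, key=backtest_error)
    return best(history)


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def plan_migration(node_history, op_history, placement, threshold=0.2):
    """Return (operator, source, target) or None if no migration is needed.
    The threshold on the predicted load gap is an assumed tuning parameter."""
    predicted = {n: predict(h) for n, h in node_history.items()}
    hot = max(predicted, key=predicted.get)
    cold = min(predicted, key=predicted.get)
    if predicted[hot] - predicted[cold] <= threshold or not placement[hot]:
        return None
    # Assumption: prefer the operator least correlated with the target's load.
    op = min(placement[hot], key=lambda o: pearson(op_history[o], node_history[cold]))
    return op, hot, cold
```

In a deployment like the one described, the management node would call something like `assign_static` once at start-up and then call `plan_migration` after each periodic status collection.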
Keywords/Search Tags: data stream, operator, distributed computing, load balancing