
Research and Implementation of Query Task Management in a Data Stream Processing System

Posted on: 2014-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: T Li
Full Text: PDF
GTID: 2268330401464479
Subject: Computer application technology
Abstract/Summary:
In recent years, with the rapid development of computer communication and network technology, many applications based on data streams have been developed in traffic management, network monitoring and security, stock market analysis, telecommunications data management, sensor network querying, and other fields. Because these applications often deal with large-scale data, the system needs to connect multiple processing nodes to form a distributed processing environment, and the query tasks that process the data are allocated evenly across the processing nodes to keep the system's load balanced. However, the randomness and uncontrollability of stream data mean that traditional distributed load management techniques cannot meet these requirements. How to manage query tasks so that the load of the entire data stream processing system stays balanced has therefore become a hot topic in current research.

This thesis addresses how to manage query tasks so that the load on the processing nodes of a distributed data stream processing system is balanced. We study the problem from two aspects: static initialization and dynamic operation. The main work and contributions are as follows:

1. To solve the static load distribution problem for query tasks in the system, this thesis proposes, after studying existing algorithms, a static load balancing algorithm based on the optimal 2-exchange algorithm. The algorithm not only takes into account the differences in processing capability among the processing nodes, but also remedies the shortcomings of heuristic algorithms. It allocates the system's query tasks evenly across the processing nodes, making the task allocation more reasonable.

2. To address the load imbalance that appears on a few processing nodes while the data stream processing system is running, this thesis presents, after studying current algorithms, a load balancing policy based on latency and load optimization. The policy considers the loads of the processing nodes and the data traffic among them, and uses dual thresholds to divide the processing nodes into three states: high, low, and normal. Query task migration is performed only between processing nodes in the high-load state and processing nodes in the low-load state, which reduces the processing delay of user requests and improves the stability of the system.

3. We design and implement a query task management solution for the distributed stream processing system that uses the static load balancing algorithm and the dynamic load balancing strategy. The static load balancing algorithm performs the initial deployment of all query tasks in the system, and the dynamic load balancing strategy then migrates selected query tasks among the processing nodes while the system is running. Tests and analysis of the loads on all processing nodes show that the static load balancing algorithm and the dynamic load balancing policy achieve the desired effect and are useful in practice.
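The dual-threshold idea in contribution 2 can be illustrated with a short Python sketch. This is not the thesis's implementation: the threshold values, the `Node` fields, and the function names `classify` and `plan_migrations` are all assumptions made for illustration, and the sketch ignores the inter-node data traffic that the policy also takes into account. It only shows how two thresholds split nodes into three states and how migration candidates are restricted to high-load sources and low-load targets.

```python
from dataclasses import dataclass

# Hypothetical threshold values; the abstract does not state the ones used.
LOW_THRESHOLD = 0.3
HIGH_THRESHOLD = 0.8


@dataclass
class Node:
    name: str
    load: float      # current load, in arbitrary work units (assumed)
    capacity: float  # processing capability, in the same units (assumed)

    @property
    def utilization(self) -> float:
        # Normalize by capacity so heterogeneous nodes are comparable.
        return self.load / self.capacity


def classify(node, low=LOW_THRESHOLD, high=HIGH_THRESHOLD):
    """Return 'high', 'low', or 'normal' based on two thresholds."""
    if node.utilization > high:
        return "high"
    if node.utilization < low:
        return "low"
    return "normal"


def plan_migrations(nodes, low=LOW_THRESHOLD, high=HIGH_THRESHOLD):
    """Pair high-load nodes with low-load nodes; normal nodes are left alone.

    Returns (source, target) name pairs. Which query tasks to move is
    outside the scope of this sketch.
    """
    highs = sorted(
        (n for n in nodes if classify(n, low, high) == "high"),
        key=lambda n: n.utilization,
        reverse=True,  # most overloaded node first
    )
    lows = sorted(
        (n for n in nodes if classify(n, low, high) == "low"),
        key=lambda n: n.utilization,  # least loaded node first
    )
    # zip stops at the shorter list, so unmatched nodes are simply skipped.
    return [(src.name, dst.name) for src, dst in zip(highs, lows)]


if __name__ == "__main__":
    cluster = [
        Node("a", load=9, capacity=10),
        Node("b", load=1, capacity=10),
        Node("c", load=5, capacity=10),
        Node("d", load=17, capacity=20),
    ]
    for node in cluster:
        print(node.name, classify(node))
    print(plan_migrations(cluster))
```

Keeping a "normal" band between the two thresholds means that nodes near the average are never chosen as migration sources or targets, which is one plausible reading of how the policy avoids unnecessary migrations and keeps the system stable.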
Keywords/Search Tags: data stream, query task management, processing node, load balancing