Dynamic Load Management Of Distributed Data Stream Processing System

Posted on: 2010-08-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y L Ou
GTID: 1118360275499032
Subject: Traffic Information Engineering & Control
Abstract/Summary:
Recently, there has been much interest in building stream processing applications for areas such as stock markets, network monitoring, security surveillance, financial analysis, online transactions, health monitoring, RFID-based object tracking, sensor applications and pervasive environments. In these applications, data are typically unbounded, continuous, large in volume, fast-arriving, time-varying and bursty. Traditional data processing, which handles snapshot queries well, cannot satisfy the requirements of these data stream applications. In recent years, researchers have begun to pay more attention to data stream management technologies, such as the construction and optimization of data stream management systems (DSMS) and data stream mining.

Load balancing is one of the key technologies for ensuring regular service and improving the system performance of a DSMS. Existing DSMSs cannot satisfy the requirements of distributed data stream processing because of their low scalability. In this dissertation, we study the basic structures, principles, implementations, characteristics and main application fields of DSMS. To deal with overload caused by variation in the input data rate, we discuss how to construct a load management system and study in particular the key technologies of load balancing, load-shedding and distributed multiple data stream join operations. The main work and contributions are the following:

(1) A hierarchical overlay network (vRing) is proposed, followed by a load balancing algorithm (vDDSLB) built on it. vRing extends Chord; by using network proximity information, it becomes a hierarchical overlay network. vDDSLB is a hierarchical load balancing algorithm constructed on top of vRing. When a node becomes overloaded, vDDSLB first balances load within the node's sub-domain. If local load balancing cannot satisfy the requirement, it launches global load balancing. Because most of the scheduling work happens within the sub-domain, system performance increases and data tuple latency decreases.

(2) A load-shedding algorithm (LPBDLS) based on linear programming is proposed. Existing load-shedding algorithms focus on a query network located on a single node. LPBDLS is an inter-node distributed load-shedding algorithm. It takes into account not only the CPU power constraint but also the network bandwidth constraint. System throughput increases, especially in environments where network bandwidth is tight.

(3) A distributed multiple data stream join algorithm (DMS-Join) is proposed. Because data streams are inherently distributed, it is better to place the join operators on different nodes than on a single node. Our algorithm can achieve higher performance under the bandwidth constraint.

(4) A dynamic distributed load management system is proposed based on the hierarchical vRing overlay network. It takes advantage of the hierarchical structure and network proximity of vRing to construct a hierarchical load management system.

The study of load management technology in this dissertation provides theoretical and application support for DSMS, and may have an important effect on improving DSMS performance.
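The local-first, then global, balancing idea described in contribution (1) can be illustrated with a minimal Python sketch. This is not the dissertation's vDDSLB algorithm, whose details are not given in this abstract; the Node class, the domain label, the numeric load and capacity fields, and the greedy "most spare capacity first" rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    domain: str      # hypothetical sub-domain label, e.g. a proximity cluster
    load: int        # current load, in arbitrary units (assumption)
    capacity: int    # load the node can sustain (assumption)

    @property
    def spare(self) -> int:
        return max(0, self.capacity - self.load)


def shed_to(peers, source, excess):
    """Move up to `excess` load from `source` to `peers`, most spare first.

    Returns the amount that could not be placed.
    """
    for peer in sorted(peers, key=lambda n: n.spare, reverse=True):
        if excess <= 0:
            break
        moved = min(excess, peer.spare)
        peer.load += moved
        source.load -= moved
        excess -= moved
    return excess


def rebalance(nodes, source):
    """Try the source's own sub-domain first, then all other sub-domains."""
    excess = source.load - source.capacity
    if excess <= 0:
        return "none"          # source is not overloaded
    local = [n for n in nodes if n.domain == source.domain and n is not source]
    remaining = shed_to(local, source, excess)
    if remaining <= 0:
        return "local"         # resolved inside the sub-domain
    remote = [n for n in nodes if n.domain != source.domain]
    remaining = shed_to(remote, source, remaining)
    return "global" if remaining <= 0 else "overloaded"
```

For example, with nodes a (load 12, capacity 10) and b (load 5, capacity 10) in the same sub-domain, rebalance resolves the overload locally; only when the sub-domain lacks spare capacity does the sketch touch nodes in other sub-domains, which is the property the abstract credits for the lower tuple latency.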
Keywords/Search Tags:Distributed data stream, Load management, Dynamic, Load balancing, Load-shedding