Research On Fault-tolerant Parallel Skyline Query Technology Over Uncertain Data Streams In Cloud Computing Environment

Posted on: 2013-07-09
Degree: Master
Type: Thesis
Country: China
Candidate: G D Wang
Full Text: PDF
GTID: 2268330422473802
Subject: Computer Science and Technology
Abstract/Summary:
As a solution to multi-criteria decision making and preference queries, skyline query processing plays an important role in many real applications. With the development of computer hardware and software, large numbers of uncertain data streams are generated in fields such as wireless sensor networks, location-based services, financial applications, and radio frequency identification. Skyline queries over uncertain data streams are widely applied in decision making, environment monitoring, and data analysis. However, the characteristics of uncertain data streams, such as data uncertainty, real-time response requirements, and single-pass scanning, pose a huge challenge for skyline queries over these streams. In particular, when data streams have high throughput, existing centralized skyline query techniques can hardly meet query needs. Therefore, the study of parallel skyline query processing over uncertain data streams has practical value.

Fortunately, with its powerful computing and storage capabilities, cloud computing can efficiently support parallel skyline query processing over uncertain data streams. Nevertheless, frequent failures in data centers pose great challenges to parallel skyline query processing. Once a query is affected by failures, not only are the skyline results incorrect, but query performance also degrades seriously and vast resources are wasted. Accordingly, we study parallel skyline queries and the corresponding fault-tolerant parallel queries over uncertain data streams.

Firstly, because existing centralized methods for skyline queries over uncertain data streams cannot meet query requirements in high-throughput data stream environments, we propose SMPS, a parallel skyline query method over uncertain data streams based on a simple parallel model. In SMPS, we partition the whole global sliding window into several even local windows, and each peer node is responsible only for updating the streaming items in its own window and computing their skyline probabilities. The skyline computation tasks are thus distributed to multiple peer nodes, which achieves parallel query processing over uncertain data streams. Moreover, the global skyline probabilities can be computed within individual nodes, so the peer nodes need not communicate with each other, which makes SMPS suitable for applications that are more concerned about network communication overhead. The experimental results show that, compared with traditional centralized methods, SMPS effectively improves processing performance. In addition, SMPS adapts well to changes in the size of the transmission block, the data dimensionality, and the length of the global sliding window.

Secondly, to handle failures of peer nodes in parallel skyline query processing over uncertain data streams, we propose SFTPS, a fault-tolerant parallel skyline query method over uncertain data streams based on the simple parallel model. In SFTPS, we double the lengths of the local windows to store backup data. The dispatcher node periodically sends heartbeat messages to each peer node. When a peer node is identified as failed, the dispatcher node chooses an idle node as a duplicate to carry on the query task that was formerly executed on the failed node. The duplicate obtains the backup data from the failed node's adjacent nodes. The experimental results show that SFTPS can detect failures and recover parallel query processing effectively and quickly. In particular, when two or more non-adjacent nodes fail, the time to recover query processing is almost the same as the recovery time when one node fails.

Thirdly, to reduce the memory overhead of parallel skyline queries over uncertain data streams, we propose DPS, a parallel skyline query method over uncertain data streams based on a distributed parallel model. In DPS, we partition the whole global window into several local windows in an alternating manner. A peer node does not need to maintain any global window; it maintains only a local window and processes only the uncertain data in that window. Peer nodes compute global skyline probabilities by communicating with each other. The experimental results show that DPS not only realizes parallel skyline query processing efficiently, but also adapts well to changes in the size of the transmission block, the data dimensionality, and the length of the sliding window.

Finally, to handle failures of peer nodes in cloud computing environments, we propose DFTPS, a fault-tolerant parallel skyline query method over uncertain data streams based on the distributed parallel model. In DFTPS, we add a saver node to check whether a peer node has failed and to back up the data held in the peer nodes. Once a peer node is identified as failed, the saver node picks an idle node as a duplicate to carry on the query task that was formerly executed on the failed node. The experimental results show that DFTPS not only has good fault tolerance, but also adapts well to changes in the length of the sliding window and in the number of failed peer nodes.
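
The idea of splitting a global window into local windows and combining per-node results can be illustrated with a small sketch. The abstract does not define the uncertainty model, so this sketch assumes a common one: each item exists independently with a given probability, and its skyline probability is its own existence probability multiplied by the probability that no dominating item exists. Reading the "alternating" partition as round-robin assignment is also an assumption, and all names (`Item`, `partition_round_robin`, `local_factor`, `global_skyline_prob`) are hypothetical, not taken from the thesis.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Item:
    """An uncertain stream item: attribute values plus an existence probability."""
    values: Tuple[float, ...]
    prob: float


def dominates(a: Item, b: Item) -> bool:
    """True if a is no worse than b in every dimension and strictly better
    in at least one (smaller values are assumed to be better)."""
    pairs = list(zip(a.values, b.values))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def partition_round_robin(window: Sequence[Item], n_nodes: int) -> List[List[Item]]:
    """Assumed reading of the alternating partition: item i goes to node i mod n_nodes."""
    local: List[List[Item]] = [[] for _ in range(n_nodes)]
    for i, item in enumerate(window):
        local[i % n_nodes].append(item)
    return local


def local_factor(item: Item, local_window: Sequence[Item]) -> float:
    """Probability that no item in this local window both exists and dominates `item`."""
    return prod(1.0 - other.prob for other in local_window if dominates(other, item))


def global_skyline_prob(item: Item, local_windows: Sequence[Sequence[Item]]) -> float:
    """Combine the per-node factors; under the independence assumption the
    product over local windows equals the product over the global window."""
    return item.prob * prod(local_factor(item, w) for w in local_windows)


# Toy global window of four two-dimensional items.
window = [
    Item((1.0, 1.0), 0.5),
    Item((2.0, 2.0), 0.8),
    Item((1.0, 3.0), 0.4),
    Item((3.0, 1.0), 0.9),
]
local_windows = partition_round_robin(window, n_nodes=2)
partitioned = [global_skyline_prob(it, local_windows) for it in window]
centralized = [global_skyline_prob(it, [window]) for it in window]
```

Under these assumptions, the partitioned and centralized computations agree, because each node contributes only a multiplicative factor. This is one way to see why peer nodes holding only local windows could still obtain global skyline probabilities by exchanging partial results.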
Keywords/Search Tags: Cloud Computing, Uncertain Skyline, Uncertain Data Stream, Parallel Computing, Fault-Tolerant Query