Research Of Skyline Query Processing Technology On Uncertain Data Stream

Posted on: 2011-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Qi
Full Text: PDF
GTID: 2178330338490132
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of computer network technology, large numbers of data streams are generated in fields such as financial information, weather information and wireless sensor networks. The complexity of the network environment makes these data streams uncertain, so techniques for processing uncertain data streams are becoming increasingly important. Skyline query processing is used to make multi-objective decisions, and skyline queries on uncertain data streams have great value in many applications. The characteristics of uncertain data streams, namely data uncertainty, real-time response requirements and single-pass processing, pose a major challenge for skyline query processing. To address the problems of object modeling, object index structure, multi-source data streams and multi-user queries, this thesis proposes three methods: a skyline query method on uncertain data streams, a distributed skyline query method on uncertain data streams, and a distributed sub-skyline query method on uncertain data streams.

Skyline query methods on uncertain data streams are used for multi-objective decision making over such streams. To address continuous probability density function modeling of uncertain data, skyline probability computation and the index structure for uncertain objects, this thesis proposes an effective skyline query method on Gaussian-model uncertain data streams (SGMU). SGMU consists of two algorithms: a dynamic Gaussian modeling algorithm (DGM) and a Gaussian-tree-based skyline query algorithm (GTS). DGM samples data objects and builds a Gaussian model for each uncertain object in the sliding window of the uncertain data stream, thereby transforming the data stream into a stream of probability density function parameters for the uncertain objects. GTS builds an R-tree index structure from the Gaussian model parameters; the R-tree is used to prune data objects and reduce the computational load. Theoretical analysis and simulation tests show that, compared with BNL (Block-Nested-Loop), a skyline query method on uncertain data streams that uses no index structure, SGMU is effective not only in modeling uncertain objects to support skyline queries but also in pruning uncertain data objects to improve the efficiency of skyline queries.

Distributed skyline query methods handle skyline queries over distributed data streams. This thesis studies the dataset partitioning and skyline computation schemes used by existing distributed skyline query methods on data streams. According to the type of dataset partitioning, two skyline query methods based on SGMU are proposed. One operates on horizontally partitioned uncertain data streams modeled by continuous probability density functions (SHUCpdf); the other operates on vertically partitioned uncertain data streams modeled by continuous probability density functions (SVUCpdf). Both methods compute local skyline results over the partial datasets at the distributed nodes and then derive the query result from those local skyline results. The main differences between SHUCpdf and SVUCpdf lie in the dataset partitioning scheme and in the skyline probability computation. Theoretical analysis and simulation results show that, compared with SGMU, SHUCpdf and SVUCpdf obtain skyline results quickly using simple data structures. Both methods also return the global skyline results accurately and continuously.

Subspace skyline queries are proposed to serve queries from multiple users. Different users may care about different attributes of the data objects, which gives rise to multiple subspaces for skyline queries. To solve the multi-user skyline query problem, this thesis proposes a subspace skyline query method on vertically partitioned uncertain data streams modeled by continuous probability density functions (SSVUCpdf). Based on SVUCpdf, SSVUCpdf computes the skyline results of the dataset on each single dimension at the distributed nodes, so that different subspaces can easily be constructed from the single-dimension datasets. In addition, SSVUCpdf stores the subspaces preferred by users in a user sub-skyline query queue, which reduces the number of subspaces considered in high-dimensional skyline queries and allows most users' requests to be answered quickly. Theoretical analysis and simulation results show that SSVUCpdf effectively reduces the query cost and avoids the "dimension explosion" problem in subspace skyline queries.
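
The abstract describes Gaussian modeling of uncertain objects and skyline probability computation without giving formulas, so the following is only an illustrative sketch and not the thesis's DGM or GTS algorithm. It assumes that smaller values are better, that the dimensions of an object are independent Gaussians, and that the skyline probability can be approximated by multiplying the marginal non-dominance probabilities, which is a simplification. All function names are hypothetical, and no R-tree pruning is performed.

```python
import math
from statistics import mean, pstdev


def fit_gaussian(samples):
    """Fit one independent Gaussian (mean, std) per dimension.

    `samples` is a list of equal-length tuples observed for one uncertain
    object inside the current window (hypothetical input format).
    """
    dims = list(zip(*samples))
    # Floor the std so that a constant dimension does not cause division by zero.
    return [(mean(d), max(pstdev(d), 1e-9)) for d in dims]


def prob_less(a, b):
    """P(X < Y) for independent X ~ N(mu_a, sd_a^2) and Y ~ N(mu_b, sd_b^2)."""
    (mu_a, sd_a), (mu_b, sd_b) = a, b
    z = (mu_b - mu_a) / math.sqrt(sd_a ** 2 + sd_b ** 2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_dominates(u, v):
    """P(u is smaller than v in every dimension), assuming independent dimensions."""
    p = 1.0
    for a, b in zip(u, v):
        p *= prob_less(a, b)
    return p


def skyline_probabilities(objects):
    """Approximate skyline probability of each object.

    Assumption: the events "object o dominates object x" are treated as
    independent across o, so the probability that x is in the skyline is the
    product of (1 - P(o dominates x)) over all other objects o.
    """
    result = {}
    for name, model in objects.items():
        p = 1.0
        for other, other_model in objects.items():
            if other != name:
                p *= 1.0 - prob_dominates(other_model, model)
        result[name] = p
    return result


def probabilistic_skyline(objects, threshold):
    """Names of objects whose approximate skyline probability is >= threshold."""
    probs = skyline_probabilities(objects)
    return sorted(name for name, p in probs.items() if p >= threshold)


if __name__ == "__main__":
    window = {
        "a": fit_gaussian([(0.9, 1.1), (1.1, 0.9)]),
        "b": fit_gaussian([(4.9, 5.1), (5.1, 4.9)]),
        "c": fit_gaussian([(0.1, 6.1), (-0.1, 5.9)]),
    }
    print(skyline_probabilities(window))
    print(probabilistic_skyline(window, threshold=0.5))
```

This sketch compares every pair of objects, which is quadratic in the window size, much like an index-free baseline. An index over the Gaussian parameters, as the abstract describes for GTS, would aim to skip most of these comparisons.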
Keywords/Search Tags: Uncertain data stream, Skyline query, Gaussian model, Distributed skyline query, Sub-space skyline query