Research On Parallel Skyline Queries Over Uncertain Data Streams

Posted on: 2019-11-21
Degree: Master
Type: Thesis
Country: China
Candidate: J Liu
GTID: 2428330611993171
Subject: Software engineering
Abstract/Summary:
Uncertain data streams, a special type of data stream, arise in many practical applications such as environmental monitoring, location-based services, financial stock trading, and web information systems, and their efficient analysis has become an important field of big data research. The skyline query over uncertain data streams plays a significant role in domains such as financial data analysis, Internet services, and wireless sensor networks, and has emerged as a hot research topic in the big data community. Two challenges currently stand out. First, because uncertain data streams in real applications often arrive continuously and rapidly, traditional centralized skyline queries struggle to meet growing query demands, so parallel query processing methods are urgently needed. Second, because users' query requirements are diverse, traditional skyline queries are of limited practical use, so processing methods for new query definitions are urgently needed. Given these challenges, the study of parallel skyline queries over uncertain data streams is of great practical significance and has become a clear trend in current research. The advancement and wide adoption of parallel computing environments such as high-performance computing and cloud computing provide powerful parallel processing capability for such queries. Moreover, the n-of-N skyline query and the k-dominate skyline query can effectively address the lack of practicality of current query methods. This dissertation therefore studies parallel n-of-N skyline queries and parallel k-dominate skyline queries over uncertain data streams in depth.

To overcome the limited flexibility of skyline queries over uncertain data streams, this dissertation proposes an interval-stabbing-based parallel uncertain n-of-N skyline query scheme named PnNS. The scheme first partitions the global sliding window into multiple local windows according to a sliding-window partitioning strategy and maps each local window to a corresponding compute node, which transforms centralized query processing over uncertain data streams into parallel query processing across multiple compute nodes. Specifically, PnNS transforms the n-of-N query into a stabbing query through an interval encoding strategy in order to improve query efficiency. Two further optimizations are applied. On the one hand, the monitor node maps newly arriving stream items to the corresponding local windows according to a stream-item mapping strategy, which balances the load across compute nodes. On the other hand, an R-tree-based spatial index organizes the elements within each local window in order to speed up dominance tests. Extensive experimental results demonstrate that, compared with existing methods, PnNS not only processes skyline queries over uncertain data streams efficiently but also greatly improves query flexibility.

To address the problem that uncertain skyline queries return so many results that they cannot efficiently offer practical insight, this dissertation proposes a dominance-capability-based parallel uncertain k-dominate skyline query method named PKDS. First, the k-dominate skyline query problem over uncertain data streams is defined. Second, PKDS maps newly arriving items to multiple compute nodes according to the stream-item mapping strategy based on sliding-window partitioning, in order to support efficient parallel processing of k-dominate skyline queries over uncertain data streams. Specifically, an index structure based on the k-dominate capability of stream items is developed to manage stream items efficiently, which greatly speeds up k-dominance tests and thereby improves the efficiency of parallel k-dominate skyline queries over uncertain data streams. Extensive experimental results demonstrate that PKDS not only reduces the results of skyline queries over high-dimensional stream items to a size that gives better decision support, but also greatly improves query efficiency.
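The abstract gives no formal definitions, so the following Python sketch only illustrates the ideas it names, under assumptions stated here rather than taken from the thesis. Smaller values are preferred in every dimension. k-dominance follows the usual definition from the skyline literature (no worse in at least k dimensions, strictly better in at least one of them). Each stream item carries an independent existence probability, and an item's skyline probability is its own probability times the probability that none of its dominators exists. The round-robin `partition_window` and all function names are hypothetical; they are not the partitioning or mapping strategies of PnNS or PKDS, and no index structure is modeled.

```python
from typing import List, Sequence, Tuple

# An uncertain stream item: (attribute values, existence probability).
Item = Tuple[Sequence[float], float]


def k_dominates(p: Sequence[float], q: Sequence[float], k: int) -> bool:
    """Return True if p k-dominates q (smaller is better, k >= 1).

    p must be no worse than q in at least k dimensions and strictly
    better in at least one dimension. With k equal to the number of
    dimensions this reduces to conventional dominance.
    """
    no_worse = sum(1 for a, b in zip(p, q) if a <= b)
    strictly_better = sum(1 for a, b in zip(p, q) if a < b)
    return no_worse >= k and strictly_better >= 1


def partition_window(window: List[Item], num_nodes: int) -> List[List[Item]]:
    """Split a global sliding window into local windows, round-robin.

    This is a placeholder strategy chosen for illustration only.
    """
    return [window[i::num_nodes] for i in range(num_nodes)]


def local_survival(values: Sequence[float], local_window: List[Item], k: int) -> float:
    """Probability that no item in one local window k-dominates `values`."""
    factor = 1.0
    for other_values, other_prob in local_window:
        if k_dominates(other_values, values, k):
            factor *= 1.0 - other_prob
    return factor


def k_skyline_probability(item: Item, window: List[Item], k: int) -> float:
    """Centralized computation over the whole window."""
    values, prob = item
    return prob * local_survival(values, window, k)


def parallel_k_skyline_probability(
    item: Item, local_windows: List[List[Item]], k: int
) -> float:
    """Combine per-node partial factors, as a monitor node might."""
    values, prob = item
    result = prob
    for local_window in local_windows:  # each factor could come from a different node
        result *= local_survival(values, local_window, k)
    return result


if __name__ == "__main__":
    window = [((1, 5, 3), 0.9), ((2, 2, 2), 0.5), ((3, 3, 1), 0.8)]
    parts = partition_window(window, 2)
    for item in window:
        print(item, parallel_k_skyline_probability(item, parts, k=2))
```

Under the independence assumption, the product over dominators factorizes across local windows, so each compute node can evaluate its own partition and only the partial factors need to be multiplied together. A threshold query would then keep the items whose combined probability is at least a user-chosen value.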
Keywords/Search Tags: Uncertain Data Streams, Skyline Queries, Parallel Queries, n-of-N Queries, k-dominate Queries