Research On Algorithm Of Uncertain Data Flow Query Processing

Posted on:2016-07-10

Degree:Master

Type:Thesis

Country:China

Candidate:H Li

GTID:2208330461987183

Subject:Computer application technology

Abstract/Summary:

As science and technology have advanced and query processing methods have become better understood, uncertain data has attracted extensive attention, and query processing over uncertain data is now used in many fields. Querying uncertain data streams has gradually become an active research topic in the database field. This thesis studies three kinds of query processing algorithms for uncertain data streams: a distributed top-k aggregation query algorithm, a probabilistic skyline query algorithm, and a clustering algorithm.

Top-k queries are commonly used in sensor networks. Most existing techniques process queries in a centralized mode, which incurs large time and communication overhead. To address this problem, this thesis studies top-k aggregation queries over uncertain data streams in a distributed environment. First, three aggregation algorithms are proposed based on the number of tuples. These three algorithms are then combined into a hybrid solution, and a top-k aggregation framework named DAT is built on that hybrid solution. Experimental results show that DAT meets the precision requirement, reduces time overhead, and incurs less communication overhead than centralized query processing.

As a method for solving multi-criteria decision problems, skyline query processing is widely used in practical applications such as market analysis and objective decision making. Building on the strengths and limitations of existing skyline query algorithms, an efficient probabilistic skyline query algorithm called PSUDS is proposed to overcome the inefficiency of enumeration-based computation. PSUDS uses a bottom-up algorithm to obtain a preliminary result set and then uses insertion and deletion algorithms to update and maintain it. Experiments on how different parameters affect the size of the p-skyline and the running time show that the algorithm scales well. Compared with the Baseline algorithm, PSUDS is more efficient.

Clustering is widely used on data streams, but most methods target certain (deterministic) data streams. Although some methods handle uncertain data streams, the great majority do not take the distribution of the data into account. To solve this problem, this thesis presents an efficient clustering algorithm for uncertain data streams called KL-Micro, based on the KL distance (relative entropy). Experimental results show that KL-Micro, by fully considering the distribution of uncertain data, improves the quality of clustering results and achieves higher accuracy and efficiency.

Experiments and analysis show that the above query processing algorithms for uncertain data streams achieve higher precision and efficiency and have practical value.
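
The KL distance (relative entropy) named in the abstract can be illustrated with a small sketch. The abstract does not describe how KL-Micro itself is built, so the code below is not the thesis's algorithm. It only shows, under stated assumptions, how relative entropy can serve as a dissimilarity between uncertain objects. The assumptions are that each uncertain object is a discrete probability distribution over a shared set of values, that a symmetrized form of the divergence is used, and that an object is assigned to the closest cluster representative. The function names `kl_divergence`, `symmetric_kl`, and `nearest_cluster` are hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Relative entropy D(p || q) in nats for two discrete distributions.

    Assumption: p and q are lists of probabilities over the same support.
    Zero entries of q are floored at eps so the logarithm stays finite.
    """
    if len(p) != len(q):
        raise ValueError("distributions must share the same support")
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0  # terms with p_i = 0 contribute nothing
    )


def symmetric_kl(p, q):
    """Average of D(p || q) and D(q || p).

    Assumption: a symmetric form is chosen here because plain KL
    divergence depends on the argument order.
    """
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def nearest_cluster(obj, centroids):
    """Index of the centroid distribution closest to obj (hypothetical helper)."""
    return min(range(len(centroids)), key=lambda i: symmetric_kl(obj, centroids[i]))


if __name__ == "__main__":
    # An uncertain reading concentrated on the first value.
    reading = [0.7, 0.2, 0.1]
    # Two illustrative cluster representatives.
    centroids = [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]]
    print(nearest_cluster(reading, centroids))
```

Comparing whole distributions in this way separates objects that share the same expected value but differ in spread, which a distance computed on expected values alone cannot do.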

Keywords/Search Tags:

Uncertain data streams, Top-k, Skyline, Clustering

Related items

1	Research On Algorithm Of Uncertain Data Flow Query Processing
2	Research On Skyline Query Over Uncertain Data Streams
3	Research On Parallel Skyline Queries Over Uncertain Data Streams
4	Research On Extending Skyline Queries Over Big Data
5	Uncertain Clustering Method And Its Application In Data Streams Processing
6	Research Of Skyline Query Processing On Uncertain Dataset Of WSNs
7	Research And Implementation On Clustering Algorithms In Uncertain Data Streams Environment
8	Study On Skyline Query Processing Techniques On Uncertain Data
9	Research On Fault-tolerant Parallel Skyline Query Technology Over Uncertain Data Streams In Cloud Computing Environment
10	Research On Distributed Parallel Skyline Queries Over Uncertain Data