
Research On Algorithm Of Uncertain Data Flow Query Processing

Posted on: 2016-07-10
Degree: Master
Type: Thesis
Country: China
Candidate: H Li
GTID: 2208330461987183
Subject: Computer application technology
Abstract/Summary:
With the development of science and technology and a deepening understanding of query processing methods, uncertain data has attracted extensive attention, and query processing over uncertain data is now widely used in many fields. Querying uncertain data streams has gradually become one of the active research topics in the database field. This paper studies three query processing algorithms for uncertain data streams: a distributed top-k aggregation query algorithm, a probabilistic skyline query algorithm, and a clustering algorithm.

Top-k queries are commonly used in sensor networks. Most existing query techniques process queries in a centralized mode, which incurs large time and communication overhead. To address this problem, this paper studies top-k aggregation queries on uncertain data streams in a distributed environment. First, three aggregation algorithms are proposed based on the number of tuples. These three algorithms are then combined into a hybrid solution, and the top-k aggregation framework, named DAT, is built on that hybrid solution. Experimental results show that DAT meets the precision requirement, reduces time overhead, and incurs less communication overhead than centralized query processing.

As a method for solving multi-criteria decision problems, skyline query processing is widely used in practical applications such as market analysis and objective decision making. Building on the strengths and limitations of existing skyline query algorithms, an efficient probabilistic skyline query algorithm, called PSUDS, is proposed to overcome the inefficiency of enumeration-based computation.
This method uses a bottom-up algorithm to obtain a preliminary result set, and then uses insertion and deletion algorithms to update and maintain it. Experiments on how different parameters affect the size of the p-skyline and the running time show that the algorithm scales well. Compared with the Baseline algorithm, PSUDS is more efficient.

Clustering is widely used on data streams, but most methods target certain (deterministic) data streams. Although some are designed for uncertain data streams, the great majority of them do not take the distribution of the data into account. To solve this problem, this paper presents an efficient clustering algorithm for uncertain data streams, called KL-Micro, based on the concept of KL distance (relative entropy). Experimental results show that KL-Micro, by fully considering the distribution of uncertain data, improves the quality of the clustering results and achieves higher accuracy and efficiency.

Experiments and analysis lead to the conclusion that the above query processing algorithms for uncertain data streams achieve higher precision and efficiency and have practical value.
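
The abstract names KL distance (relative entropy) as the basis of KL-Micro but does not give the formula or the clustering procedure. As a minimal illustration only, the standard definition for two discrete distributions, D_KL(P || Q) = sum_i p_i * log(p_i / q_i), can be sketched as below. It is an assumption here that each uncertain object and each cluster summary is a discrete probability distribution over the same set of values. The names `kl_divergence` and `nearest_cluster` are hypothetical, and the nearest-cluster assignment is a generic use of the measure, not the KL-Micro algorithm itself.

```python
import math


def kl_divergence(p, q):
    """Relative entropy D_KL(P || Q), in nats, of two discrete distributions.

    p and q are sequences of probabilities over the same support (assumption).
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same support size")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a term with p_i = 0 contributes 0 by convention
        if qi == 0.0:
            return math.inf  # P has mass where Q has none
        total += pi * math.log(pi / qi)
    return total


def nearest_cluster(obj, centroids):
    """Index of the centroid distribution with the smallest divergence from obj.

    Hypothetical helper: a generic assignment step, not taken from the thesis.
    """
    return min(range(len(centroids)),
               key=lambda i: kl_divergence(obj, centroids[i]))
```

Note that KL divergence is not symmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ. Which direction (or which symmetrised variant) the thesis uses is not stated in the abstract.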
Keywords/Search Tags: Uncertain data streams, Top-k, Skyline, Clustering