Research Of Probabilistic Skyline Query Over Uncertain Data

Posted on: 2012-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: S T Liang
Full Text: PDF
GTID: 2218330368987804
Subject: Computer application technology
Abstract/Summary:
Skyline queries over large-scale datasets have received considerable attention in recent years, and many effective skyline algorithms have been proposed. Earlier work focused mainly on traditional static datasets. As research progressed, a growing number of researchers turned to uncertain data and proposed algorithms for the probabilistic skyline query over uncertain data.

Uncertain data is a newer type of data that has gradually attracted attention as its application areas have grown and information technology has developed. For an uncertain data object, either its existence or its attribute values are not definite but are associated with a probability. Uncertain data can be classified in different ways. By the granularity of the uncertainty, it is divided into tuple-uncertain data and attribute-uncertain data. By the continuity of the data values, it is divided into discrete uncertain data and continuous uncertain data.

Some progress has been made on skyline queries over uncertain datasets. However, most existing algorithms are suitable only for tuple-uncertain data or for discrete data with uncertain attribute values. To date, there is no effective algorithm for computing the skyline probability of data whose uncertain attribute values are defined by a continuous probability distribution function, even though such data is widespread. This paper focuses on continuous attribute-uncertain data.
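For orientation, the conventional skyline on certain (non-probabilistic) data, which the probabilistic variants generalize, can be sketched as follows. This is a minimal illustration and is not taken from the thesis: it assumes that smaller values are preferred on every attribute, and the names `dominates` and `skyline` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """True if p is no worse than q on every attribute and strictly
    better on at least one (assumption: smaller is better)."""
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive quadratic skyline: keep points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    data = [(1, 4), (2, 2), (4, 1), (3, 3), (5, 5)]
    print(skyline(data))
```

With uncertain data, dominance between two objects is no longer a yes/no fact but an event with a probability, which is why each object is assigned a skyline probability.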
In this work, we build general models for continuous uncertain data and design two representations of uncertain data, based on the uniform distribution and the normal distribution, respectively. We then propose principles for computing the probabilistic skyline of uncertain data under the uniform and normal distributions, and we prove the correctness of the computing principles for the normal distribution case. Based on these principles, we propose an optimized algorithm that uses an indexing strategy and a pruning strategy to compute the skyline probability of such data. Furthermore, we extend the optimized algorithm to support progressive probabilistic skyline queries over uncertain data streams. Finally, extensive experiments demonstrate that the proposed algorithms are effective on large datasets.
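The abstract does not state the computing principles themselves, so the following is only a hedged sketch of what a skyline probability means for continuous attribute-uncertain data. It is not the thesis's algorithm and uses no indexing or pruning. It assumes that every attribute of every object is independently and uniformly distributed on an interval, that objects are mutually independent, and that smaller values are preferred. The names `uniform_cdf`, `dominance_prob_at`, and `skyline_probability` are hypothetical. The skyline probability of an object u is estimated by Monte Carlo sampling: draw a value x of u, then multiply, over all other objects v, the probability that v does not dominate x.

```python
import random
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]        # (low, high) of one uniform attribute
UncertainObject = Sequence[Interval]  # one interval per attribute


def uniform_cdf(x: float, low: float, high: float) -> float:
    """P(V <= x) for V uniform on [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def dominance_prob_at(v: UncertainObject, x: Sequence[float]) -> float:
    """P(v dominates the fixed point x), assuming independent attributes.
    For continuous values, ties have probability zero, so this is the
    product of P(V_i <= x_i) over all attributes."""
    prob = 1.0
    for (low, high), xi in zip(v, x):
        prob *= uniform_cdf(xi, low, high)
    return prob


def skyline_probability(u: UncertainObject,
                        others: List[UncertainObject],
                        samples: int = 20000,
                        seed: int = 0) -> float:
    """Monte Carlo estimate of P(no object in `others` dominates u)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        # Sample one concrete value of u.
        x = [rng.uniform(low, high) for low, high in u]
        # Given x, the other objects are independent of each other.
        not_dominated = 1.0
        for v in others:
            not_dominated *= 1.0 - dominance_prob_at(v, x)
        total += not_dominated
    return total / samples


if __name__ == "__main__":
    u = [(0.0, 1.0), (0.0, 1.0)]
    v = [(0.0, 1.0), (0.0, 1.0)]
    print(skyline_probability(u, [v]))
```

For two objects with the same uniform distribution on the unit square, the exact value under these assumptions is 1 - (1/2)(1/2) = 0.75, which the estimate should approach. A normal-distribution variant would replace `uniform_cdf` with the normal CDF and the uniform sampling with normal sampling.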
Keywords/Search Tags: uncertain data, skyline, query, probability, normal distribution