Font Size: a A A

Study On Indexing And Range Query Processing Techniques For Uncertain Data

Posted on:2012-12-06Degree:MasterType:Thesis
Country:ChinaCandidate:S ChangFull Text:PDF
GTID:2298330467476239Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
In recent years, as the rapid development of data collection technology, data resources which could be used by people have increased dramatically. The database which is a scientific technique for organizing, storing and managing mass data has been widely used in many fields. In addition, with people gradually in-depth understanding of the objective world, the uncertainty of data prevalent existing in reality applications has got more attention of research community. Adding an index in uncertain database can greatly improve the efficiency of data query. Therefore, how to introduce an effective index mechanism to manage uncertain data and effectively support query operation becomes a hotspot of database.Due to probability, range queries studied systematically in the traditional database fields can’t be directly used in uncertain database. In order to solve this problem, this thesis proposes two kinds of different index algorithms which can effectively support the probability threshold query based on uncertain data. The contributions of this thesis are summarized as follows:(1) The concept of marginal probability is defined and an index algorithm based on marginal probability is proposed. Using the marginal probability information attached to nodes, this thesis designs a group of algorithms which can rapidly calculate the lower bound and upper bound probability of the intersection of query region and uncertain region, and avoid directly calculating appearance probability by filtering uncertain objects. The index algorithm is completely dynamic which can insert and delete uncertain objects in any order. Moreover, arbitrary probability density function can be handled in this algorithm. Comprehensive experiments have been conducted and show that the index algorithm is an effective index algorithm which is superior to other index algorithm based on uncertain data from I/O and CPU time.(2) The rules of partition are given and an index algorithm based on partition is proposed. The basic idea of the algorithm is first dividing uncertain regions of uncertain objects into a number of tuples, then merging the tuples which have close MBRs into new units, finally filtering the objects through rules. The algorithm reduces uncertain objects which will enter into the candidate and improves the query efficiency. The index algorithm supports querying uncertain objects with arbitrary probability density function and is not sensitive to the size and shape of the query regions. Extensive experiments have been conducted and show that compared with other index algorithm based on uncertain data, this algorithm has better query processing performance.
Keywords/Search Tags:Uncertain Data, Range Query, Multi-Dimensional Index, MarginalProbability, Partition
PDF Full Text Request
Related items