Font Size: a A A

Research And Implementation Of Range Query Algorithm Based On Uncertain Data

Posted on:2015-01-18Degree:MasterType:Thesis
Country:ChinaCandidate:G GanFull Text:PDF
GTID:2308330473953710Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
With the continuous development of computer technology, many applications such as radio frequency identification (RFID), GPS guidance, wireless sensors, radar speed etc start to occur frequently. Due to the limitations of collecting information technology and storing information technology, as well as precision limitation of instruments and equipment, resulting in the generation of uncertain data. Inherent uncertainty of uncertain data introduces a demand of probability, the probability can describe the uncertain correlation values of properties. While traditional database management technology has reached a more mature stage, but it did not consider the demand of probability, traditional data management techniques are no longer suitable for uncertain data, so it presents new challenges to manage uncertain data for database theory and technology. Currently, although uncertain data using probabilistic data model represents the major probability model can represent a strong correlation among properties of uncertain data, but in terms of queries and probabilistic reasoning relatively cost high time complexity. Therefore, the study of an efficient indexing and fast search algorithm for supporingt uncertain data is also the focus of the current hot issue.In view of the uncertainty problem of data query, this thesis has carried on the analysis and research. Firstly, we propose a new index structure S-Box to manage uncertain data. This structure can efficiently support range query based on multidimensional probabilistic data. S-Box is a kind of modified by R-tree. In the index structure, we mainly put forward a new framework named skeleton. In each node of S-Box, we have recorded a group of skeletons, which can provide a very tight boundary constraint to query on the space. So that it can filter out the objects without overlapping with the query area, reducing the query access cost and time cost, improving the query speed.Secondly, we put forward a new data structure named BBD+-tree to manage uncertain data object. BBD+-tree use a multi-resolution grid to mark probability density function of the uncertain object, providing a compact bound on the probability for a query.Moreover, based on the index structure S-Box, we put forward two kinds of query algorithm on uncertain data range queries. Algorithm SBO makes use of S-Box to return uncertain data objects that appeared in a query region with a probability greater than a given probability threshold value. With the skeleton and pruning strategy, algorithm SBO reduces the search space and computing cost, thus improving the query speed and efficiency. Further, we put forward the optimization algorithm SCFB of SBO, the basic thoughts of algorithm SCFB is essentially the same with SBO, the only difference is that a filter CF we design between the BBD+-tree of objects and leaf nodes of S-Box. If the object can be pruned away directly by the CF, then there is no need to access the BBD+-tree, which can reduce the time of visit BBD+-tree. Otherwise, it needs to continue to access the BBD+-tree. Algorithm SCFB on the time overhead is optimized further, making queries more quickly and efficiently.Finally, in order to analyze and verify the performance advantages of structure and the algorithms, we designed a lot of comparison tests. By the comparison analysis with typical U-Tree, UD-Tree, experimental results show that the index structure and the proposed algorithms have a good performance advantage.
Keywords/Search Tags:uncertainty data, multi-resolution grid, skeleton, range queries
PDF Full Text Request
Related items