Font Size: a A A

Research Of High-Dimensional Space Query Algorithm Based On Space-Filling Curves

Posted on:2011-06-11Degree:DoctorType:Dissertation
Country:ChinaCandidate:H B XuFull Text:PDF
GTID:1118330332471638Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
The spatial database is a popular research of the database domain. The spatial database breaks through the traditional database system, which applys for the application of the information about the data of text and number. The spatial database can store and analyze massive data, which has complex structure. The spatial database is a kind of database system, which can store spatial and non-spatial data. Data model and query language of the spatial database sustains spatial data type, spatial index structure, spatial query and spatial analysis.Recently, the application of high-dimensional data appears in many domains, such as data warehouse and multimedia query based on the content of objects. In real query process, such as electronic commerce, medical diagnosis, the users not only have high demand of query precision, but also have high demand of query time. Therefore, these application systems must have high query efficiency. There is a need to study fast query algorithms of massive high-dimensional space.The spatial query algorithms based on Brute-Force method, R-tree, VA-file and NB-tree achieve better performance in low-dimensional space, but their performance suffers greatly in high-dimensional space. The reduction of the dimensionality is the key to the spatial query in high-dimensional space. The space-filling curve is a mapping method from high-dimensional space to linear space, divides the whole space into equal-size grids, and imposes a linear order of the points in the grids.Because the performance of above algorithms suffers greatly in high-dimensional space, the paper uses the characteristics of reducing dimensionality and clustering data of space-filling curves. According to the procedure of constructing the space-filling curves, divide high-dimensional space into equal-size grids. Therefore, map the points lying in the grids into linear space.The paper presents a high-dimensional approximate k-nearest-neighbor query algorithm AKNN based on Hilbert curve. The algorithm AKNN transforms the points in d dimensional space (d+1) times, maps the points into linear space, uses B~+-tree to store the points. The algorithm AKNN scans k previous and successive points of the query point in B~+-trees, and chooses 2k(d+1) points which are nearest as approximate k nearest neighbors.The paper presents a high-dimensional approximate k-closest-pair query algorithm AKCP based on Z curve. The algorithm AKCP scans the points (d+1) times linearly, calculates the minimum grid of the current point and its kth successive point. When the border length of the minimun grid is greater than the muminun of the candidates, the current point is neglected.Using the notion Z-region to cluster similar data, the paper presents an improved index structure B~Z-tree, a depth-first range query algorithm B~ZRQ in high-dimensional space. The algorithm B~ZRQ uses effective strategy to cut the branches, scans B~Z-tree quickly.The paper presents a grid-partition cluster algorithm HC based on Hilbert curve. First, the algorithm HC merges numerous small clusters based on the grids. Then it merges the clusters in many rounds. Finally, it gets large better clusters.The experiments show these algorithms are applicable and effective.
Keywords/Search Tags:space-filling curve, high-dimensional space, closest-pair query, nearest-neighbor query, range query, cluster query
PDF Full Text Request
Related items