Font Size: a A A

Based On Improved GMM Clustering Algorithm Research On The Construction Method Of Hilbert-R Tree

Posted on:2019-01-04Degree:MasterType:Thesis
Country:ChinaCandidate:Z Y JiangFull Text:PDF
GTID:2428330548495819Subject:Engineering
Abstract/Summary:PDF Full Text Request
With the rapid development of geo-information system,spatial database has played an important role in many areas.As spatial data bears characteristics of complexity,diversity and magnanimity,the traditional relational database based on B tree cannot achieve good storage and quick access.Thus,spatial index technology emerges as the times require.One of the hot and tough issues in the current spatial database research is how to achieve a good index structure and reduce redundant query path.R tree is a spatial index structure that use minimum wrapped rectangle to approximate represent spatial data,in this way,spatial data can be stored and accessed effectively.The Hilbert-R tree is one of the variants of the R tree.It adopts fractal dimension technique to reduce dimension of spatial multidimensional data,so that the amount of calculation has been reduced and the purpose of construct index quickly has been achieved.However,after the data in the Hilbert-R tree is mapped,the nearest object in the original physical space may not be in a leaf node,so errors or redundant paths may appear in nearest neighbor queries,the quality and efficiency of index will be reduced.In view of the shortcomings of the existing traditional and improved spatial index,this thesis mainly focuses on the following three aspects:1.Aiming at shortcomings of Hilbert-R tree whose physical space adjacent data points may not be in one node,the clustering algorithm GMM(Gauss Mixture Model)is firstly adopted to pre-process special data.The GMM is capable of achieving high similarity among clusters and low similarity between clusters,in this way,the adjacent data is ensured in the same leaf node.And due to the low similarity between nodes,the overlap of MBR(Minimum Bounding Rectangle)is reduced and the efficiency of index is improved.2.According to deficiencies of GMM about arbitrary initial value and local optimal solution,method of using hierarchical clustering algorithm CURE to divide data,the expectation and variance of the data are obtained after the iterated layer.As the expectation and variance at this point are no longer arbitrary,it can be used as the initial value of GMM algorithm.In this way,the influence of noise points or outliers on clusters is avoided,the quality of cluster is improved,thus the data distribution in the node is more compact and the query efficiency is improved.Bee colony algorithm is used to make up for the deficiencies of local optimum and to improve the quality of whole Hilbert-R tree index structure.3.In accordance with the shortcomings of existing spatial uncertain data about constructing index,the improved GMM algorithm is taken to generate Hilbert-R tree,the amount of integral computation is thus reduced.At the same point,the adoption of bottom-up static batch mode has ensured the whole Hilbert-R tree of full capacity except for root node,the spatial utilization of index has been improved.As the final purpose of a good index is to query efficiently,in this thesis,a method of combining clustering with Voronoi is proposed to prune uncertain data,reduce the time to search for objects and improve index efficiency.
Keywords/Search Tags:Spatial database, Hilbert-R tree, GMM lgorithm, CURE, Voronoi diagram, Uncertain data
PDF Full Text Request
Related items