Font Size: a A A

Research On R-tree Index Method Based On The Improved Clustering

Posted on:2017-06-23Degree:MasterType:Thesis
Country:ChinaCandidate:H Y CuiFull Text:PDF
GTID:2348330482486645Subject:Software engineering
Abstract/Summary:PDF Full Text Request
With the rapid development of GIS technology,the field of spacial index technology application,whose major function is to realize efficient storage of data and further increase the efficiency of updating and querying, has become increasingly wider. Since the spatial object,whose spatial relation is complex, has characteristics of heterogeneity,complexity and uncertainty, how to build an effective spatial index structure to query and update data has became a tough issue in the current database field. As a height-balanced tree, the R-tree can store spatial data efficiently, make up for the deficiency of traditional relational database when handling spatial data.It is mainly applied in the business spatial database.This paper is divided into following parts to research on the methods of building and querying the index structure in the spatial database:First, facing the deficiency of the traditional R tree index structure dynamic generation algorithm, the static batch generation algorithm is proposed.Introducing CURE clustering algorithm to pre-process data, and taking advantage of constriction factor to make data more compact, then the spatial utilization of nodes has been improved. After clustering, the similarity among leaf nodes is rather low, so that the overlapping ratio of MBR has been reduced efficiently.Second, according to the shortcoming of the selection of initial value and the sensitive to isolated points of the K-means algorithm when processing spatial data, this paper takes advantages of CURE algorithm to improve the K-means,that has avoided the influence of arbitrary initial values and noise points on the clustering results, and has reduced the time complexity of hierarchical clustering.The R tree which is built on the basis of improved clustering algorithm has promoted the efficiency of index building and nodes query.Finally, in order to store the uncertain data with high dimension efficiently,this paper has coded uncertain data according to the dimension deduction feature of the Hilbert curve, in that way, the calculation of expected distance between each data points has been avoided, and the amount of calculation has been reduced. The introduction of CURE clustering method has made the index structure more efficiently when dealing with isolated point and massive uncertain data set. Furthermore,the index structure which can effectively store uncertain data and its corresponding probability threshold query algorithm is proposed.
Keywords/Search Tags:R-tree, index structure, CURE algorithm, Hilbert curve, uncertain
PDF Full Text Request
Related items