
R-Tree Index Construction Of Dynamic K-Means Algorithm

Posted on: 2019-09-22
Degree: Master
Type: Thesis
Country: China
Candidate: B Han
Full Text: PDF
GTID: 2428330566464645
Subject: Engineering·Software Engineering
Abstract/Summary:
Purpose: Current approaches to constructing an R-tree with the k-means clustering algorithm have two defects. First, the value of k in k-means must be fixed in advance, and the algorithm ignores differences in the information carried by each attribute. Second, it is unclear how to keep the R-tree valid and well structured while it is being built. To address these two problems, this thesis improves the k-means algorithm, along with the distance measure and the data allocation used during R-tree construction, in order to improve the efficiency of R-tree construction, retrieval, and other operations.

Design/methodology/approach: Building on the k-means algorithm and combining nearest-neighbor theory, information entropy theory, and probability and statistics theory, the R-tree construction algorithm is optimized in three respects: the selection and determination of the initial center points, distance weighting between data items according to the actual situation, and the division of redundant data.

Findings: First, the traditional k-means algorithm is improved by selecting the initial center points and weighting each attribute; experiments show that both the number of iterations and the accuracy improve. Second, the work addresses the dynamic determination of the number of clusters k, proposing an attribute-weighted k-means algorithm based on a nearest-neighbor model that determines k dynamically and thus obtains an ideal number of clusters. Finally, using this algorithm together with weights determined from the area and perimeter of spatial data, a well-structured R-tree is constructed according to a data allocation principle.

Research limitations/implications: (1) the definition of noise data; (2) the dynamic allocation of data in R-tree nodes.

Practical implications: The proposed dynamic k-means algorithm can reasonably and efficiently give the optimal number of clusters for a data set, and it is applied to R-tree construction to improve both construction efficiency and retrieval efficiency.

Originality/value: A dynamic k-means algorithm is proposed based on nearest-neighbor data and information-entropy attribute weights, which can effectively obtain the number of clusters. The area and perimeter of spatial data are incorporated as weights into the distance between spatial data items, so that the distance is influenced by the shape of the spatial data, which is closer to reality.
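
The abstract does not state the exact weighting formulas used in the thesis. As a rough illustration of what information-entropy attribute weighting can look like, the sketch below uses the standard entropy weight method and plugs the resulting weights into a Euclidean distance. The function names `entropy_weights` and `weighted_distance`, the min-max scaling step, and the sample data are assumptions made for illustration, not the author's implementation.

```python
import math


def entropy_weights(rows):
    """Return one weight per attribute using the entropy weight method.

    Assumption: each column is min-max scaled before its entropy is taken.
    Attributes whose values are spread unevenly get lower entropy and
    therefore a larger weight; a constant attribute gets weight 0.
    """
    n = len(rows)
    m = len(rows[0])
    divergences = []
    for j in range(m):
        column = [row[j] for row in rows]
        lo, hi = min(column), max(column)
        if hi == lo:
            # A constant attribute carries no distinguishing information.
            divergences.append(0.0)
            continue
        scaled = [(v - lo) / (hi - lo) for v in column]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        # Normalized Shannon entropy in [0, 1]; 0 * log(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergences.append(1.0 - entropy)
    total_div = sum(divergences)
    if total_div == 0:
        # No attribute is informative: fall back to equal weights.
        return [1.0 / m] * m
    return [d / total_div for d in divergences]


def weighted_distance(a, b, weights):
    """Euclidean distance with a per-attribute weight on each squared term."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


# Hypothetical sample: the second attribute is constant across all rows.
data = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]]
weights = entropy_weights(data)
print(weights)
print(weighted_distance([1.0, 5.0], [3.0, 9.0], weights))
```

In a k-means loop, a distance of this kind would replace the plain Euclidean distance in the assignment step, so attributes that carry more information have more influence on which cluster a point joins.
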
Keywords/Search Tags: Machine Learning, Data Mining, Dynamic K-Means, KNN, Spatial Index, R-Tree