Research On Complexity-oriented Spatial Data Clustering Analysis Methods

Posted on: 2009-05-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Yang
Full Text: PDF
GTID: 1118360275477241
Subject: Computer application technology
Abstract/Summary:
As data capture technologies develop, spatial data are becoming increasingly diverse and massive in volume, and huge amounts of spatial data are being collected. Spatial data mining techniques are urgently needed to find the useful knowledge hidden in these data. Spatial clustering is an important branch of spatial data mining because of its practicality and efficiency, and research in related areas has become a hot spot.

Based on in-depth study of spatial data mining, spatial clustering, and existing methods, this thesis researches efficient spatial data clustering analysis methods oriented to four aspects of spatial data complexity: massive volume, high dimensionality, obstacles and constraints, and multiple scales.

For massive spatial data, the classical K-Means algorithm, which is well suited to clustering massive data sets, is used. An algorithm is proposed that initializes the K-Means cluster centers through an optimized division exploiting the characteristics of spatial data. This method addresses two problems of the classical K-Means algorithm: the value of k must be preset, and the initial cluster centers are selected randomly. The efficiency and accuracy of clustering for massive spatial data are improved.

For high-dimensional spatial data, a high-dimensional spatial clustering algorithm based on fuzzy extension is proposed. In this algorithm the sparse grids, which matter most for estimating the edges of clusters, are extended based on fuzzy sets to take into account the correlation of spatial data in adjacent grids. The algorithm aims to ameliorate the problems of existing high-dimensional subspace spatial clustering algorithms (unsmooth clusters, unclear cluster boundaries, and an excessive number of meaningless small clusters) and to cluster high-dimensional spatial data more efficiently.

For spatial data with obstacles and constraints, a grid-based hierarchical spatial clustering algorithm is proposed. It inherits the advantages of grid-based clustering, handles obstacles of arbitrary shape, and finds clusters of arbitrary shape efficiently. Meanwhile, a hierarchical strategy is used to reduce the complexity of clustering in the presence of obstacles and constraints, which improves the running efficiency of the algorithm.

For multi-scale spatial data, a spatial multi-scale clustering algorithm based on density isolines is proposed. The algorithm borrows the idea of contour lines and uses the natural density standard provided by density isolines to find spatial clustering results under multi-scale conditions efficiently.

In summary, the thesis researches methods oriented to the massive, high-dimensional, obstacle-and-constraint, and multi-scale characteristics of spatial data. The feasibility, validity, and efficiency of each new method are demonstrated by simulation experiments.
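
The abstract does not describe the K-Means initialization algorithm in detail, so the following is only a minimal sketch of the general idea of seeding K-Means from a spatial partition instead of at random. It is not the thesis's method. The uniform grid, the names `grid_seed_centers` and `kmeans`, the `cell_size` parameter, and the sample points are all assumptions made for illustration, and k is still supplied by the caller here.

```python
from collections import defaultdict


def grid_seed_centers(points, cell_size, k):
    """Return up to k seeds: centroids of the k most populated grid cells.

    Hypothetical helper: a uniform grid stands in for the "optimized
    division" mentioned in the abstract, whose details are not given.
    """
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    # Densest cells first; ties broken by cell index so the result is deterministic.
    ranked = sorted(cells.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    centers = []
    for _, members in ranked[:k]:
        cx = sum(p[0] for p in members) / len(members)
        cy = sum(p[1] for p in members) / len(members)
        centers.append((cx, cy))
    return centers


def kmeans(points, centers, iterations=10):
    """Plain Lloyd iterations starting from the given centers."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda i: (p[0] - centers[i][0]) ** 2 + (p[1] - centers[i][1]) ** 2,
            )
            groups[nearest].append(p)
        # Keep the old center if a group ends up empty.
        centers = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


# Illustrative data only: two dense regions and one stray point.
points = [(0.1, 0.2), (0.3, 0.1), (0.2, 0.4), (5.1, 5.2), (5.3, 5.4), (9.0, 0.5)]
seeds = grid_seed_centers(points, cell_size=1.0, k=2)
final_centers = kmeans(points, seeds)
```

Because the seeds come from the densest cells, repeated runs on the same data start from the same centers, which is the property a non-random initialization is meant to provide.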
Keywords/Search Tags: spatial clustering, spatial data complexity, initial clustering centers, fuzzy extending, obstacle grid, multi-scale clustering