
Research On The Spatial Clustering Analysis

Posted on: 2016-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Xiao
GTID: 2298330467487311
Subject: Computer application technology
Abstract/Summary:
Clustering, or cluster analysis, is an important branch of data mining and has become a widely used tool for identifying the internal structure of data. Clustering is an unsupervised pattern recognition process that divides data objects into homogeneous classes called clusters. Objects within a cluster are more similar to each other than to objects in other clusters. Clustering allows the grouping of spatial samples to be recognized quickly and efficiently, and it can extract the group-level structural characteristics of spatial data. Cluster analysis therefore plays an important role in revealing the distribution of spatial samples and predicting the development trend of spatial objects.

The research in this thesis on clustering analysis in data mining is organized into the following four parts.

First, among the traditional partitioning clustering algorithms, the k-means algorithm is sensitive to initialization and easily becomes trapped in a local optimum. To overcome this disadvantage, this thesis presents an improved k-means algorithm based on expectation of density. The improved algorithm chooses as initial centers the k sample objects that are farthest from one another and that belong to the expectation-of-density region. The experimental results show that the improved k-means algorithm depends only weakly on the initial data and achieves high clustering quality.

Second, the number of clusters k is difficult to determine in practical cases for the traditional k-means clustering algorithm.
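
The initialization idea described above (choosing k mutually distant samples from dense regions) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of the thesis: the abstract does not define the expectation-of-density region, so here density is assumed to be the number of neighbours within a hypothetical radius `eps`, and the dense region is assumed to be every point whose density is at least the mean density. The function name `density_aware_init` and the sample data are likewise made up for the example.

```python
import math


def density_aware_init(points, k, eps):
    """Pick k initial centers: dense points that are far from one another.

    Assumptions (not taken from the thesis): density is the number of
    neighbours within radius eps, and the dense region is every point
    whose density is at least the mean density.
    """
    n = len(points)
    density = [
        sum(1 for j in range(n)
            if j != i and math.dist(points[i], points[j]) <= eps)
        for i in range(n)
    ]
    mean_density = sum(density) / n
    candidates = [i for i in range(n) if density[i] >= mean_density]
    if len(candidates) < k:
        raise ValueError("fewer dense candidates than requested centers")

    # Start from the densest candidate, then repeatedly add the candidate
    # whose distance to its nearest chosen center is largest (greedy maximin).
    chosen = [max(candidates, key=lambda i: density[i])]
    while len(chosen) < k:
        remaining = [i for i in candidates if i not in chosen]
        nxt = max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[c]) for c in chosen),
        )
        chosen.append(nxt)
    return [points[i] for i in chosen]


# Hypothetical data: two tight groups and one isolated outlier.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
centers = density_aware_init(points, k=2, eps=1.5)
```

With this data the outlier at (50, 50) has no neighbours within `eps`, so it is filtered out before the farthest-point step and cannot be picked as a center, which is the kind of robustness a density filter is meant to provide.
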
To address the difficulty of choosing k, we combine the improved k-means algorithm based on expectation of density with the Silhouette validity index to evaluate the clustering quality for different values of k and determine the optimal number of clusters.

Third, we present a fuzzy c-means algorithm combined with an improved artificial bee colony algorithm that uses a rank-based fitness selection strategy. The strategy aims to increase the selection probability of individuals with better fitness. The proposed algorithm combines the high efficiency of the fuzzy c-means algorithm with the global search ability of the artificial bee colony algorithm, and it overcomes the sensitivity of the traditional fuzzy c-means clustering algorithm to the selection of initial cluster centers.

Fourth, the analysis of uncertain data has become one of the hot topics in data mining and knowledge discovery because such data reflect real-world conditions objectively. Considering the uncertainty of data in the real world and the fuzzy boundaries between sample objects, we present a new uncertain clustering algorithm based on the fuzzy c-means algorithm to organize and analyze uncertain data. Finally, the experimental and analytical results demonstrate the feasibility and effectiveness of the proposed algorithms.
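
To make the rank-based fitness selection strategy more concrete, the sketch below shows one common way such a scheme can be written. In the standard artificial bee colony algorithm, onlooker bees pick food sources with probability proportional to raw fitness; a rank-based scheme uses only the ordering of the fitness values instead. The abstract does not give the exact formula used in the thesis, so the simple linear ranking here (probability equal to rank divided by the sum of ranks) is an assumption, and the function names and fitness values are hypothetical.

```python
import random


def rank_selection_probabilities(fitness):
    """Return one selection probability per individual, based on rank only.

    Assumption: simple linear ranking, where the worst individual gets
    rank 1, the best gets rank n, and probability is rank / sum of ranks.
    """
    n = len(fitness)
    order = sorted(range(n), key=lambda i: fitness[i])  # worst -> best
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    total = n * (n + 1) // 2  # 1 + 2 + ... + n
    return [r / total for r in ranks]


def select_food_source(fitness, rng=random):
    """Roulette-wheel draw over the rank-based probabilities."""
    probs = rank_selection_probabilities(fitness)
    return rng.choices(range(len(fitness)), weights=probs, k=1)[0]


# Hypothetical fitness values (higher is better).
fitness = [0.10, 0.90, 0.50, 0.30]
probs = rank_selection_probabilities(fitness)
picked = select_food_source(fitness, random.Random(0))
```

Because the probabilities depend on rank rather than on the fitness values themselves, a single individual with a very large fitness cannot dominate the draw, while better individuals are still selected more often than worse ones.
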
Keywords/Search Tags: expectation of density region, effective number of clusters, artificial bee colony algorithm, fuzzy clustering, uncertain data