Research On The Spatial Clustering Analysis

Posted on:2016-04-20

Degree:Master

Type:Thesis

Country:China

Candidate:Y P Xiao

Full Text:PDF

GTID:2298330467487311

Subject:Computer application technology

Abstract/Summary:

Clustering, or cluster analysis, is an important branch of data mining and has become a widely used tool for identifying the internal structure of data. Clustering is an unsupervised pattern recognition process that divides data objects into homogeneous classes called clusters, such that objects within a cluster are more similar to each other than to objects in other clusters. With clustering, the way spatial samples group together can be recognized quickly and efficiently, and the group-level structural characteristics of spatial data can be extracted. Clustering analysis therefore plays an important role in revealing the distribution of spatial samples and in predicting the development trend of spatial objects.

This thesis studies clustering analysis in the field of data mining in four parts.

First, among the traditional partitioning clustering algorithms, the k-means algorithm is sensitive to initialization and is easily trapped in local optima. To overcome this disadvantage, the thesis presents an improved k-means algorithm based on the expectation of density. The improved algorithm chooses as initial centers the k sample objects that are farthest from one another among those belonging to the expectation-of-density region. Experimental results show that the improved k-means algorithm depends only weakly on the initial data and achieves high clustering quality.

Second, the number of clusters k is difficult to determine in practice for the traditional k-means algorithm. To address this shortcoming, we combine the improved k-means algorithm based on the expectation of density with the Silhouette validity index, analyze the clustering quality for different values of k, and determine the optimal number of clusters.

Third, we present a fuzzy c-means algorithm combined with an improved artificial bee colony algorithm that uses a rank-based fitness selection strategy. The strategy aims to increase the selection probability of individuals with better fitness. The proposed algorithm combines the high efficiency of fuzzy c-means with the global search ability of the artificial bee colony algorithm, and it overcomes the sensitivity of the traditional fuzzy c-means algorithm to the selection of initial cluster centers.

Finally, the analysis of uncertain data has become one of the hot topics in data mining and knowledge discovery because of its realism and objectivity. Considering the uncertainty of data in the real world and the fuzzy boundaries between sample objects, we present a new uncertain clustering algorithm based on fuzzy c-means to organize and analyze uncertain data. Experimental and analytical results demonstrate the feasibility and effectiveness of the proposed algorithms.
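The idea of choosing k with the Silhouette validity index can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not define the expectation-of-density region, so a plain farthest-first initialization is used here as a simplified stand-in, and all function names (`farthest_first_centers`, `kmeans`, `silhouette`, `best_k`) are hypothetical. It assumes Python 3.8+ (for `math.dist`), points given as tuples, and k of at least 2.

```python
import math

def farthest_first_centers(points, k):
    """Pick k mutually distant initial centers (stand-in for the
    density-based selection described in the abstract)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point whose nearest chosen center is farthest away.
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(nxt)
    return centers

def kmeans(points, k, iters=100):
    """Standard k-means; returns one cluster label per point."""
    centers = farthest_first_centers(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)

def best_k(points, k_values):
    """Return the k with the highest mean silhouette, plus all scores."""
    scores = {k: silhouette(points, kmeans(points, k)) for k in k_values}
    return max(scores, key=scores.get), scores
```

Calling `best_k(points, range(2, 6))` on a list of coordinate tuples returns the candidate k with the highest mean silhouette together with the score for each k. The sketch uses exhaustive pairwise distances, so it is only suitable for small data sets.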

Keywords/Search Tags:

expectation of density region, effective number of clusters, artificial bee colony algorithm, fuzzy clustering, uncertain data

Related items

1	Clustering Algorithm Of Position Uncertain Data Based On Connection Number
2	Density-based Uncertain Data Clustering Algorithm
3	Research On Uncertain Data Streams Clustering Algorithm Based On Tuple Cluster Feature
4	Research And Improvement Of Uncertain Clustering Algorithm For Interval Valued Data
5	A Preliminary Study Of Clustering Method For Uncertain Data
6	Research On Blocking Fuzzy Clustering Algorithm Based On Density Of Samples
7	Optimization Of Bee Colony Algorithm And Its Application In Clustering
8	Interval Number-Based Uncertain Data Mining And Its Applications
9	Density-based And Grid-based Uncertain Data Stream Clustering Algorithm In Vulnerability Detection
10	Research On Model And Algorithm Of Fuzzy Cloud Computing Resource Scheduling