
Research on Affinity Propagation and Its Application to High-Dimensional Data

Posted on: 2013-01-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Liao
Full Text: PDF
GTID: 2248330374475897
Subject: Computational Mathematics
Abstract/Summary:
In 2007, Frey et al. first proposed Affinity Propagation (AP) clustering in the journal Science. AP clustering uses a similarity matrix to construct a responsibility matrix and an availability matrix. Using information from close neighbors, it iteratively updates these two matrices, and the final cluster centers are obtained from them. Because of its stability and good clustering results, AP clustering has attracted the attention of many scholars. However, like other partition-based clustering algorithms, it is better suited to spherical data sets. How to make AP clustering applicable to data sets with more complex structure is therefore an important research direction.

Building on existing research, this thesis proposes solutions to three problems: how to expand the scope of application of AP clustering, how to improve its clustering results, and how to apply it to high-dimensional data.

First, the thesis introduces the principle and characteristics of AP clustering and explains the significance of each iteration step by analyzing the iterative process. Using the decision matrix, it analyzes the properties of AP clustering, and it also introduces applications of AP clustering.

Second, because AP clustering does not produce good results on data sets with complex structure, and following a study of path-based similarity measures, the thesis proposes a new AP clustering algorithm with a shortest-path-based similarity measure. Experiments show that the improved algorithm obtains better clustering results on data sets with complex shapes.

Finally, the thesis introduces random projection dimensionality reduction methods and cluster ensemble methods. Combining the advantages of the two, it proposes a cluster ensemble solution based on random projection dimensionality reduction. Combined with AP clustering and with AP clustering using the shortest-path-based similarity measure, this method is suitable for clustering high-dimensional data sets. Experiments show that these combinations obtain better clustering results on high-dimensional data sets than the k-means algorithm and the EM algorithm.
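The responsibility and availability updates mentioned in the abstract can be illustrated with a minimal NumPy sketch of the standard AP message-passing rules as published by Frey and Dueck. It is not the thesis's own implementation: the function name `affinity_propagation`, the `damping` and `max_iter` parameters, and the convention that the diagonal of `S` holds the exemplar preferences are assumptions made here for illustration.

```python
import numpy as np


def affinity_propagation(S, damping=0.5, max_iter=200):
    """Sketch of standard AP on an n x n similarity matrix S.

    Assumption: S[k, k] holds the preference of point k to be an exemplar.
    Returns (exemplar indices, label array giving each point's exemplar).
    """
    S = np.asarray(S, dtype=float)
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    rows = np.arange(n)

    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max_{k' != k} [a(i,k') + s(i,k')]
        AS = A + S
        idx = np.argmax(AS, axis=1)
        first = AS[rows, idx]
        AS[rows, idx] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, idx] = S[rows, idx] - second
        R = damping * R + (1.0 - damping) * R_new

        # a(i,k) = min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        # a(k,k) = sum_{i' != k} max(0, r(i',k))
        Rp = np.maximum(R, 0.0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[rows, rows].copy()
        A_new = np.minimum(A_new, 0.0)
        A_new[rows, rows] = diag  # self-availability is not clipped at 0
        A = damping * A + (1.0 - damping) * A_new

    # Point k is an exemplar when r(k,k) + a(k,k) > 0.
    exemplars = np.flatnonzero((A + R)[rows, rows] > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels
```

A typical choice, assumed here rather than taken from the thesis, is to set `S` to the negative squared Euclidean distance between points and to fill the diagonal with a common preference value; a lower preference tends to yield fewer clusters. A path-based variant such as the one the abstract describes would change only how `S` is computed, not the update loop.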
Keywords/Search Tags: Affinity Propagation, Clustering, Random Projection, Cluster Ensembles