The Study Of Modified Affinity Propagation Clustering And Its Application

Posted on:2018-05-15

Degree:Master

Type:Thesis

Country:China

Candidate:D Tang

Full Text:PDF

GTID:2348330512478575

Subject:Statistics

Abstract/Summary:

Cluster analysis is a key branch of multivariate statistical analysis and is widely used in many fields. Affinity Propagation (AP) is a relatively new unsupervised clustering algorithm, proposed by Frey and Dueck in 2007. It does not require the initial cluster centers or the number of clusters to be specified in advance. Once the similarity matrix and the Preference values have been constructed, the algorithm identifies suitable cluster exemplars automatically through message passing. Preliminary studies show that the algorithm has several advantages, such as fast computation, a low sum of squared errors, and high clustering accuracy. However, it also has some disadvantages. First, the AP algorithm uses the negative Euclidean distance as its similarity measure, but the Euclidean distance is appropriate only when the attributes are independent, is sensitive to the units (scale) of each attribute, and gives every attribute equal importance in the distance. This paper proposes a weighted Mahalanobis distance based on the mean square error and takes the negative of this distance as the similarity measure of the AP algorithm. The Mahalanobis distance adapts to the distribution of the data, and the weighting based on the mean square error takes the influence of each attribute into account. This not only widens the scope of application of the algorithm but also improves the accuracy of the clustering results. Second, the AP algorithm sets the Preference P to the same value for every point, which assumes that all data points are equally likely to become class representatives (exemplars) and ignores the effect of the data distribution on that likelihood. To address this defect, this paper proposes setting P for each point according to the sum of the memberships of all other points to that point: the greater the sum, the more likely the point is to become a class representative. Setting P according to the data distribution assigns higher P values to the points that are more likely to become class representatives, which reduces the number of iterations and the running time. Finally, to obtain clusterings with 1 to k clusters, an adaptive step size for dynamically adjusting the P value is put forward, together with the Gap index for estimating the optimal number of clusters.
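The two ideas described above, a negative Mahalanobis-type distance as the similarity and a Preference that may differ from point to point, can be illustrated with a minimal NumPy sketch. This is not the thesis's implementation: the function names, the way the attribute weights enter the distance, and the damping and iteration settings are assumptions made for illustration, and the abstract does not give the exact mean-square-error weighting or the membership formula for P.

```python
import numpy as np

def mahalanobis_similarity(X, weights=None):
    """Negative squared Mahalanobis distance between all pairs of rows of X.

    Assumption: per-attribute weights w enter as W^(1/2) Sigma^-1 W^(1/2);
    the thesis's mean-square-error weighting is not specified in the abstract.
    """
    X = np.asarray(X, dtype=float)
    d = X.shape[1]
    w = np.ones(d) if weights is None else np.asarray(weights, dtype=float)
    cov_inv = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    sw = np.sqrt(w)
    M = cov_inv * np.outer(sw, sw)
    diff = X[:, None, :] - X[None, :, :]
    return -np.einsum("ijk,kl,ijl->ij", diff, M, diff)

def affinity_propagation(S, preference, damping=0.5, max_iter=200):
    """Plain AP message passing (Frey and Dueck, 2007).

    `preference` is a scalar or a length-n array, so a per-point P can be used.
    Returns (exemplar indices, labels); labels are -1 if no exemplar emerges.
    """
    S = np.array(S, dtype=float)
    n = S.shape[0]
    rows = np.arange(n)
    S[rows, rows] = preference            # P goes on the diagonal of S
    R = np.zeros((n, n))                  # responsibilities
    A = np.zeros((n, n))                  # availabilities
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        AS = A + S
        idx = AS.argmax(axis=1)
        first = AS[rows, idx]
        AS[rows, idx] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, idx] = S[rows, idx] - second
        R = damping * R + (1 - damping) * R_new
        # a(i,k) = min(0, r(k,k) + sum of positive r(i',k), i' not in {i,k})
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[rows, rows].copy()   # a(k,k) is not clipped at 0
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = diag
        A = damping * A + (1 - damping) * A_new
    exemplars = np.flatnonzero((A + R)[rows, rows] > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = S[:, exemplars].argmax(axis=1)
    labels[exemplars] = np.arange(exemplars.size)
    return exemplars, labels

# Usage on synthetic data, with the common default of a median Preference.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(8, 1, (20, 2))])
S = mahalanobis_similarity(X)
exemplars, labels = affinity_propagation(S, preference=np.median(S))
```

Passing an array instead of a scalar as `preference` gives each point its own P, which is where a membership-based value of the kind the abstract describes would be plugged in. Lower P values generally yield fewer clusters, which is what an adaptive search over P exploits.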

Keywords/Search Tags:

Affinity Propagation Clustering, weighted Mahalanobis distance, membership, Preference, Gap statistic

Related items

1	Improved Affinity Propagation Clustering Algorithm Based On Multiple Theories And Its Applications
2	Research On Affinity Propagation Clustering Algorithm
3	Based On Affine Propagation Clustering Algorithm To Improve Research
4	An Improved Affinity Propagation Clustering Algorithm For Reducing Complexity
5	Research Of Quantum Affinity Propagation Clustering Algorithm
6	Research On Affinity Propagation Clustering Algorithm Based On Manifold Distance And Density Adjustment
7	Research On Adaptive Affinity Propagation Clustering Algorithm Based On Neighbor Similarity
8	Research On Affinity Propagation Clustering Algorithm For Probabilistic Undirected Graph Model
9	Improvement Study Of Affinity Propagation Clustering Algorithm And Its Applications
10	Research And Application Of Clustering Parallel Strategy For Affinity Propagation