Data Mining Algorithm Based On Affinity Propagation Clustering Analysis

Posted on: 2016-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y P Xie
Full Text: PDF
GTID: 2348330503454386
Subject: Control theory and control engineering
Abstract/Summary:
With the rapid development of science, technology, and networking, human society has entered the era of big data. Many domains need to extract knowledge and information from massive datasets, and the concept of data mining has emerged to meet this need. The affinity propagation (AP) clustering algorithm is an important data mining algorithm. It takes as input a similarity matrix computed from a similarity measure between samples and constructs a responsibility matrix and an availability matrix. Real-valued messages (responsibilities and availabilities) are exchanged between data points until an optimal set of representative points (called cluster centers) and the corresponding clusters gradually emerge. AP does not require initial cluster centers to be designated or the number of clusters to be set in advance. However, its clustering results can be relatively poor and are ultimately affected by the preference parameter. This thesis focuses on several drawbacks of the AP clustering algorithm; the following aspects are analyzed and studied:

1. On complex data sets, the affinity propagation clustering algorithm suffers from low clustering efficiency and low accuracy. A semi-supervised affinity propagation clustering algorithm based on a kernel function (the K-SAP clustering algorithm) is proposed. This algorithm first maps the complex clustering space into the feature space and changes the similarity measure by means of a kernel function. Then a semi-supervised algorithm adjusts the similarity matrix so that data in the same cluster become neighbours. Finally, the AP algorithm iterates and updates to obtain the global optimum. Simulation results show that the proposed algorithm performs better and is more accurate than the SAP algorithm when clustering complex data sets.

2. On high-dimensional data sets, the semi-supervised affinity propagation clustering algorithm suffers from low clustering precision and a large amount of computation. A semi-supervised affinity propagation clustering algorithm based on locally linear embedding (LLE-SAP) is proposed. First, this algorithm maps the high-dimensional input data set into a low-dimensional space with the LLE algorithm and calculates the similarity matrix of the low-dimensional data. It then adjusts the similarity matrix with the semi-supervised algorithm and finally clusters the low-dimensional data with the affinity propagation clustering algorithm. Simulation results show that the proposed algorithm achieves higher precision and needs fewer iterations on high-dimensional data.

3. The value of the preference parameter in the affinity propagation (AP) clustering algorithm has a direct impact on clustering accuracy, and an empirical value of the preference parameter cannot ensure optimal clustering results. To solve this problem, an affinity propagation clustering algorithm based on differential evolution (DE-AP) is proposed. First, AP cluster analysis is performed with the preference parameter set to an empirical value. Next, the clustering results are used to determine whether the preference parameter is optimal; if it is not, the preference parameter is used as the input population of the differential evolution algorithm. Finally, the mutation, crossover, and selection operations of the differential evolution algorithm adjust the parameter intelligently, the individual with the highest fitness value is selected as the preference parameter, and this value is passed back to AP for clustering again. Experiments on classic datasets, evaluated by the number of clusters, the accuracy, and the FMI, show that the DE-AP algorithm can effectively address the influence of the preference parameter on the clustering results and thus improve clustering accuracy.
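The message exchange that the abstract describes can be illustrated with a minimal sketch of plain AP in pure Python. It is not the thesis's K-SAP, LLE-SAP, or DE-AP method. The function name `affinity_propagation`, the damping factor, the stopping rule, the toy data, and the choice of negative squared Euclidean distance as similarity are all assumptions made for illustration. The preference (the diagonal of the similarity matrix) is set to the median of the off-diagonal similarities, a commonly used empirical value; the abstract does not say which empirical value the thesis adopts.

```python
import math
import statistics

def affinity_propagation(S, damping=0.5, max_iter=200, stable_iter=15):
    """Plain AP on an n x n similarity matrix S (list of lists).
    S[k][k] holds the preference of point k (assumed to be set by the caller)."""
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    exemplars, last, stable = [], None, 0
    for _ in range(max_iter):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        for i in range(n):
            AS = [A[i][k] + S[i][k] for k in range(n)]
            best = max(range(n), key=AS.__getitem__)
            first = AS[best]
            second = max((AS[k] for k in range(n) if k != best),
                         default=-math.inf)
            for k in range(n):
                new = S[i][k] - (second if k == best else first)
                R[i][k] = damping * R[i][k] + (1 - damping) * new
        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]  # positive responsibilities from i' != k
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
        # A point is a cluster center when r(k,k) + a(k,k) > 0.
        exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
        stable = stable + 1 if exemplars == last else 0
        last = exemplars
        if exemplars and stable >= stable_iter:
            break
    if not exemplars:
        return [], [None] * n
    # Each point joins the exemplar it is most similar to.
    labels = [max(exemplars, key=lambda k: S[i][k]) for i in range(n)]
    for k in exemplars:
        labels[k] = k
    return exemplars, labels

# Toy data (assumed): two well-separated groups of three 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
n = len(points)
# Similarity: negative squared Euclidean distance (an assumed measure).
S = [[-float((px - qx) ** 2 + (py - qy) ** 2) for (qx, qy) in points]
     for (px, py) in points]
# Preference: median of the off-diagonal similarities (assumed empirical value).
preference = statistics.median(S[i][j] for i in range(n)
                               for j in range(n) if i != j)
for k in range(n):
    S[k][k] = preference

exemplars, labels = affinity_propagation(S)
print("cluster centers:", exemplars)
print("labels:", labels)
```

Lowering `preference` tends to yield fewer clusters and raising it tends to yield more, which is the sensitivity that the third contribution targets. A kernel-based or LLE-based variant in the spirit of the first two contributions would change only how `S` is built, not the message-passing loop.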
Keywords/Search Tags:Data Mining, Affinity Propagation Clustering, Kernel Function, Local Linear Embedding Algorithm, Differential Evolution Algorithm