Improvement Study Of Affinity Propagation Clustering Algorithm And Its Applications

Posted on: 2018-07-08
Degree: Master
Type: Thesis
Country: China
Candidate: P Li
GTID: 2348330515984731
Subject: Control Science and Engineering
Abstract/Summary:
Clustering, one of the unsupervised learning methods, is an important and active research topic in machine learning, data mining, pattern recognition, and related fields. Affinity Propagation (AP) is an exemplar-based clustering method proposed by Frey and Dueck in 2007. It treats all data points as potential exemplars and identifies clusters automatically through message passing among the data points, which avoids many of the poor clustering solutions caused by unsuitable initialization and hard decisions. Several studies have shown that AP can outperform earlier algorithms (such as K-means and K-medoids) in some application areas. Since it was proposed, AP has attracted considerable attention and has been studied and applied in many domains, including face recognition, image segmentation and categorization, and text mining. Many improvements and extensions have also been developed.

As a relatively new clustering algorithm, AP still has some open problems. A key one is how to set the preference P (an N×1 vector {p_k} whose elements indicate how likely the corresponding data point is to be chosen as an exemplar). The preference P significantly affects the AP clustering result, and improper values may lead to suboptimal clustering solutions. In most AP and AP-based algorithms, all element preferences are set to a single constant and kept unchanged throughout the clustering process. For practical clustering problems, however, making all data points share one common exemplar preference may not be appropriate: it ignores the information contained in the data distribution and may cause unnecessary computation in the iterative message-passing process. To address this problem, this thesis studies the AP algorithm and its improvement. The proposed improved AP algorithm is applied to the clustering of standard real-world test datasets and to the flow pattern identification of gas-liquid two-phase flow. The main work and innovations are as follows:

1. A new AP algorithm, the Adjustable Preference Affinity Propagation (APAP) algorithm, is proposed. It introduces two ideas: assigning a different value to each p_k, and updating these values automatically. First, the initial value of p_k is determined by the nearest-neighbor similarity set of the corresponding data point. Second, each element preference p_k is adjusted according to the mutual effects among the exemplars during message passing. As a result, an additional preference-adjusting constraint is introduced, the number of constraint function nodes increases, and the structure of the factor graph changes relative to that of the original AP. Experiments on synthetic data are carried out. Compared with the standard AP algorithm, APAP has better overall performance on four evaluation indexes: Classification Rate (CR), Rand Index (RI), Normalized Mutual Information (NMI), and Number of Iterations (NI). The performance of APAP is also compared with that of the multi-exemplar AP (MEAP) algorithm and the adjustable AP (adAP) algorithm. Experimental results verify that the proposed APAP algorithm is feasible and effective.

2. The proposed APAP algorithm is applied to the clustering of real-world data. Four representative and commonly used datasets from the standard test database (the UCI database) provided by the University of California, Irvine, are used in the application study. Experimental results verify that APAP performs better than the standard AP algorithm (most of the CR, RI, NMI, and NI indexes of APAP are better than those of AP). The results also indicate that, compared with other improved algorithms (MEAP and adAP), APAP requires fewer iterations and less time and is more stable.

3. The proposed APAP algorithm is applied to the flow pattern identification of gas-liquid two-phase flow, and a new flow pattern identification method based on APAP is presented. First, signal features are extracted. Then, APAP is used to search for the exemplars of typical flow patterns. Finally, the flow pattern is identified according to the nearest-neighbor rule. Two kinds of sensors are adopted in the flow pattern identification study: a 12×6 photodiode array sensor and a radial C4D (Capacitively Coupled Contactless Conductivity Detection) sensor. Experimental results verify that the clustering results of APAP coincide with the actual flow patterns, and that the identification accuracies obtained by the new APAP-based method are higher than 89.5%. This performance verifies the feasibility and effectiveness of the proposed method and its application potential in flow pattern identification.
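To make the role of the preference vector concrete, the following is a minimal Python sketch of the standard AP message-passing updates (Frey and Dueck, 2007) in which each point k has its own preference p_k on the diagonal of the similarity matrix. It is not the APAP algorithm: the abstract does not give the preference-adjusting rule or the nearest-neighbor initialization in detail, so neither is implemented here. The function name, the damping and stopping parameters, the toy data, and the median-based preference are all assumptions made for illustration.

```python
import numpy as np


def affinity_propagation(S, preference, damping=0.7, max_iter=300, stable_iter=15):
    """Standard AP with a per-point preference vector (illustrative sketch)."""
    S = np.array(S, dtype=float)            # copy so the caller's matrix is untouched
    n = S.shape[0]
    rows = np.arange(n)
    S[rows, rows] = preference              # s(k, k) = p_k, one value per point
    R = np.zeros((n, n))                    # responsibilities r(i, k)
    A = np.zeros((n, n))                    # availabilities a(i, k)
    last, stable = None, 0
    exemplars = np.array([], dtype=int)

    for _ in range(max_iter):
        # r(i, k) = s(i, k) - max_{k' != k} [a(i, k') + s(i, k')]
        M = A + S
        best = M.argmax(axis=1)
        first = M[rows, best]
        M[rows, best] = -np.inf
        second = M.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1.0 - damping) * R_new

        # a(i, k) = min(0, r(k, k) + sum_{i' not in {i, k}} max(0, r(i', k)))
        # a(k, k) = sum_{i' != k} max(0, r(i', k))
        Rp = np.maximum(R, 0.0)
        Rp[rows, rows] = R[rows, rows]      # keep r(k, k) unclipped
        A_new = Rp.sum(axis=0)[None, :] - Rp
        self_avail = A_new[rows, rows].copy()
        A_new = np.minimum(A_new, 0.0)
        A_new[rows, rows] = self_avail
        A = damping * A + (1.0 - damping) * A_new

        # Point k is an exemplar when r(k, k) + a(k, k) > 0.
        exemplars = np.flatnonzero(np.diag(A) + np.diag(R) > 0)
        key = tuple(exemplars.tolist())
        if key and key == last:
            stable += 1
            if stable >= stable_iter:       # exemplar set unchanged long enough
                break
        else:
            last, stable = key, 0

    if exemplars.size == 0:
        return exemplars, np.full(n, -1)    # no exemplar emerged
    labels = exemplars[S[:, exemplars].argmax(axis=1)]
    labels[exemplars] = exemplars           # each exemplar labels itself
    return exemplars, labels


# Toy data (assumed): two well-separated groups on a line.
X = np.array([0.0, 0.1, 0.25, 10.0, 10.1, 10.3])
S = -(X[:, None] - X[None, :]) ** 2         # negative squared distance
off_diag = S[~np.eye(len(X), dtype=bool)]
pref = np.full(len(X), np.median(off_diag)) # the common constant-preference choice
exemplars, labels = affinity_propagation(S, pref)
print(exemplars, labels)
```

Here `pref` is a constant vector, which corresponds to the shared-preference setting the abstract criticizes. Because the function accepts any length-N vector, a data-dependent initialization such as the one APAP describes could be passed in its place.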
Keywords/Search Tags: pattern recognition, clustering analysis, Affinity Propagation, factor graph, classification