Clustering Ensemble Based On Density Peaks

Posted on: 2018-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: R H Chu
Full Text: PDF
GTID: 2348330515469104
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of internet technology, society has entered the era of big data. People produce a lot of data in their daily lives. In the future, in every field, more and more decisions will depend on data analysis. How to analyze massive data effectively and find the valuable information hidden behind it has become a new focus.

Clustering ensemble combines two techniques: clustering and ensemble learning. Using this model can improve the accuracy, robustness, and stability of the final results. A semi-supervised clustering ensemble model can be designed by adding semi-supervised information to the clustering ensemble model during the integration process. Under certain conditions, the clustering results obtained by this model may be superior to those of the unsupervised clustering model.

In this thesis, the affinity propagation (AP) algorithm is used as the base clustering algorithm, and different base clustering results are obtained by changing its initial input parameters. Rapid computation of the maximal information coefficient (RapidMic) is introduced to represent the correlation of the base clustering results, expressed as a similarity matrix. This matrix is selected to represent the density relationship of the sample dataset. Isometric feature mapping (Isomap) is used to reduce the dimension, so as to demonstrate that the density relationship of the sample dataset can be revealed by the base clustering results. By improving the density peaks (DP) algorithm, this thesis designs the k_DP algorithm, which can automatically select several points with larger density peaks as the clustering centers. The clustering ensemble algorithm KDPE is then designed according to this idea. The experimental results show that KDPE can achieve better clustering ensemble results than several classic models.

Finally, this thesis attempts to incorporate semi-supervised information into the proposed model to improve its clustering ensemble results. The semi-supervised clustering algorithm SDPE is designed based on semi DP, an improved version of the DP algorithm. Comparison with KDPE shows that SDPE can optimize the clustering results and improve on the performance of KDPE under certain semi-supervised ratios.
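
The density peaks idea that k_DP builds on can be illustrated with a minimal Python sketch. The abstract does not say how k_DP selects its centers, so the rule used here (taking the k points with the largest product of local density rho and separation delta) is only an assumption based on the standard DP algorithm. The function name `density_peaks`, the Gaussian kernel, and the parameters `k` and `dc` are hypothetical choices for illustration, and the toy points stand in for the similarity matrix that the thesis derives from the base clusterings.

```python
import math


def density_peaks(dist, k, dc):
    """Density peaks clustering on an n x n distance matrix.

    Returns (centers, labels). Picking the k largest rho * delta scores
    is an illustrative assumption, not the thesis's k_DP rule.
    """
    n = len(dist)
    # Local density rho_i, estimated with a Gaussian kernel of width dc.
    rho = [
        sum(math.exp(-((dist[i][j] / dc) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]
    # Visit points from densest to sparsest.
    order = sorted(range(n), key=lambda i: -rho[i])
    delta = [0.0] * n   # distance to the nearest denser point
    parent = [-1] * n   # index of that nearest denser point
    for pos, i in enumerate(order):
        if pos == 0:
            # The densest point has no denser neighbour; use its
            # largest distance to any point instead.
            delta[i] = max(dist[i])
            continue
        j = min(order[:pos], key=lambda m: dist[i][m])
        delta[i] = dist[i][j]
        parent[i] = j
    # Centers: the k points with the largest rho * delta.
    centers = sorted(range(n), key=lambda i: -rho[i] * delta[i])[:k]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Every other point inherits the label of its nearest denser point,
    # which is already labelled because points are visited by density.
    for i in order:
        if labels[i] == -1:
            j = parent[i]
            if j == -1:  # densest point was not chosen as a center
                j = min(centers, key=lambda c: dist[i][c])
            labels[i] = labels[j]
    return centers, labels


# Toy data: two well-separated groups of five 2-D points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
dist = [[math.dist(p, q) for q in points] for p in points]
centers, labels = density_peaks(dist, k=2, dc=1.0)
```

On this toy data the two middle points are picked as centers and each group receives a single label. In an ensemble setting the distance matrix would instead be derived from the similarity between samples across the base clusterings, for example as one minus the similarity; that conversion is likewise an assumption rather than something the abstract specifies.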
Keywords/Search Tags: clustering ensemble, semi-supervised clustering ensemble, affinity propagation, density peaks, similarity matrix