Research On Weighted Cluster Ensemble Algorithm Based On Validity Evaluation

Posted on: 2019-09-13  Degree: Master  Type: Thesis
Country: China  Candidate: J Gao  Full Text: PDF
GTID: 2428330551458744  Subject: Computer application technology
Abstract/Summary:
Cluster analysis, as an important branch of data mining technology, has been widely used in many fields such as machine learning, pattern recognition, and information retrieval. In recent years, researchers have proposed many clustering algorithms for different data types. However, these algorithms cannot handle some data sets well, for example data sets with complex distributions, very large data sets, and data sets that contain isolated points. Compared with a single clustering algorithm, a cluster ensemble algorithm offers stronger robustness, stability, parallelism, and insensitivity to noise, and has achieved good ensemble results on data sets from many domains. The study of cluster ensemble algorithms has therefore attracted widespread attention.

This paper studies some problems in cluster ensemble algorithms. For example, most cluster ensemble algorithms treat the base clustering results equally, so the ensemble results are easily affected by low-quality cluster members. Based on this analysis, the paper proposes two new weighted cluster ensemble algorithms based on cluster validity functions. The main research results of the paper are as follows.

(1) A weighted cluster ensemble algorithm based on a single cluster validity function is proposed. This algorithm fuses existing cluster validity indices and cluster diversity indices to construct a new evaluation index. Based on this index, a new weighted cluster ensemble algorithm is designed. The proposed algorithm is compared with existing cluster ensemble algorithms on UCI data sets. The experimental results show that, by weighting, the new algorithm reduces the impact of low-quality base clusterings on the ensemble results and can improve the clustering effectiveness of the ensemble results.

(2) A weighted cluster ensemble algorithm based on multiple cluster validity functions is proposed. The method uses multiple cluster validity indices to evaluate the quality of the clustering results, and then performs a weighted fusion based on the spatial similarity matrix of the data set. Based on this, a secondary-weighting clustering ensemble algorithm is designed. The experimental analysis on UCI data sets shows that, compared with the weighted clustering ensemble algorithm based on a single validity function, the new method evaluates the quality of the clusterings more effectively and improves the accuracy of the clustering ensemble.

In short, this paper studies cluster ensemble algorithms from the perspective of cluster validity, and validates the effectiveness of the proposed algorithms through a large number of experiments on these data sets. The research in this paper provides a new method for data analysis and has good practical value in data mining and other fields.
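The general idea of weighting base clusterings by a quality score can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not give the validity index, the diversity index, or the consensus step, so the sketch assumes that each base clustering already has a non-negative quality score, combines the clusterings into a weighted co-association matrix, and derives a consensus partition by thresholding that matrix. The function names `weighted_coassociation` and `consensus_from_coassociation`, the `threshold` parameter, and the example scores are all hypothetical.

```python
import numpy as np


def weighted_coassociation(labelings, scores):
    """Combine base clusterings into a weighted co-association matrix.

    labelings: list of 1-D label sequences, one per base clustering.
    scores: one non-negative quality score per base clustering
            (assumed to come from some cluster validity index).
    """
    labelings = [np.asarray(labels) for labels in labelings]
    scores = np.asarray(scores, dtype=float)
    if scores.sum() <= 0:
        raise ValueError("scores must have a positive sum")
    weights = scores / scores.sum()  # normalise so the weights sum to 1

    n = labelings[0].shape[0]
    co = np.zeros((n, n))
    for labels, w in zip(labelings, weights):
        # Entry (i, j) gains w when points i and j share a cluster.
        co += w * (labels[:, None] == labels[None, :])
    return co


def consensus_from_coassociation(co, threshold=0.5):
    """Assumed consensus step: link points whose weighted co-association
    exceeds `threshold`, then label the connected components."""
    n = co.shape[0]
    labels = np.full(n, -1, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.nonzero(co[i] > threshold)[0]:
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Two base clusterings agree; the third disagrees and has a low score.
base = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
quality = [0.8, 0.7, 0.1]  # made-up scores for illustration only
co = weighted_coassociation(base, quality)
print(consensus_from_coassociation(co))  # prints [0 0 1 1]
```

In this toy example the low-scoring third clustering contributes only 0.0625 of the total weight, so it cannot pull apart the pairs that the two higher-scoring clusterings group together.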
Keywords/Search Tags:Cluster analysis, Clustering ensemble, Clustering validity function, Base clustering evaluation