Research On Clustering Ensemble Algorithm Based On Weight Designing

Posted on: 2016-07-05
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 2308330479484815
Subject: Computer system architecture
Abstract/Summary:
With the advent of the big data era, more and more information lies hidden in rapidly growing volumes of data, and extracting useful information and knowledge from it has become both more important and more difficult. As an important technique in data mining, cluster analysis is attracting increasing attention and finding wider application. However, any single clustering algorithm runs into various problems when applied to a specific task, and the stability and accuracy of clustering need deeper study.

Clustering ensemble algorithms were proposed to address these problems to some extent. Using ensemble learning techniques, a clustering ensemble first collects the results of different clustering members, which are generated by different algorithms or by the same algorithm with different parameters, and then combines those results with a consensus function. The ensemble result is often better than that of an individual clustering algorithm. This thesis systematically reviews the field of cluster analysis and studies the basic principles and methods of clustering ensemble algorithms. Many clustering ensemble methods exist, but most of them ignore the quality of the clustering members, so the final result is affected when some members are of poor quality or disturbed by noise.

To address these problems, we propose a clustering ensemble algorithm based on weight designing, which improves existing algorithms by weighting the clustering members. The main contents are as follows:

① By studying clustering ensemble methods and the SoA-WCE algorithm, we analyze the problems in that algorithm. SoA-WCE builds a decision system between the clustering members and the first ensemble result, uses the significance measure of each clustering member to design its weight, and then applies the same consensus function to obtain the final result. However, if some clustering members are of poor quality, the first ensemble result will also be poor, which in turn affects the final result. We therefore put forward a mutual-information-weighted clustering ensemble based on significance of attribute (the MI-SoA-WCE algorithm) and analyze its weight design process. The algorithm evaluates the quality of the clustering members, selects the good-quality ones, analyzes the differences between them, and then carries out the subsequent processing.

② We designed and implemented the MI-SoA-WCE algorithm and compared the results of the CSPA algorithm, the SoA-WCE algorithm and the MI-SoA-WCE algorithm on five groups of datasets, using the F-measure as the evaluation indicator. We also added noise to test the noise resistance of the algorithm. The experimental results show that the weight-designing approach obtains better results than the other algorithms in the situations tested; that is, adding weight design improves the results of the clustering ensemble algorithm.
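
The general idea of weighting clustering members by quality before applying a consensus function can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the weighting rule (mean normalized mutual information with the other members) and the consensus step (connected components of a thresholded, weighted co-association matrix) are simplifying assumptions standing in for the attribute-significance weighting and the CSPA-style consensus function named in the abstract. All function names and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from math import log, sqrt


def nmi(a, b):
    """Normalized mutual information between two label lists."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(c / n * log(c * n / (ca[x] * cb[y])) for (x, y), c in cab.items())
    ha = -sum(c / n * log(c / n) for c in ca.values())
    hb = -sum(c / n * log(c / n) for c in cb.values())
    return mi / sqrt(ha * hb) if ha > 0 and hb > 0 else 0.0


def member_weights(members):
    """Assumed rule: weight = mean NMI with the other members, normalized to sum to 1."""
    m = len(members)
    if m == 1:
        return [1.0]
    quality = [
        sum(nmi(members[i], members[j]) for j in range(m) if j != i) / (m - 1)
        for i in range(m)
    ]
    total = sum(quality)
    return [q / total for q in quality] if total > 0 else [1.0 / m] * m


def weighted_coassociation(members, weights):
    """Entry (i, j) is the total weight of members that put points i and j together."""
    n = len(members[0])
    co = [[0.0] * n for _ in range(n)]
    for labels, w in zip(members, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += w
    return co


def consensus(co, threshold=0.5):
    """Stand-in consensus: connected components of the graph co[i][j] > threshold."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three members agree on the partition (label names may differ); one is noisy.
members = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 1, 0, 1, 0, 1],  # noisy member
]
weights = member_weights(members)
labels = consensus(weighted_coassociation(members, weights))
```

In this toy example the noisy member receives a much smaller weight than the three agreeing members, so it contributes little to the co-association matrix and the consensus partition separates the first three points from the last three.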
Keywords/Search Tags: Data Mining, Cluster Analysis, Cluster Ensemble, Weight Designing