
The Study On Semi-supervised Clustering Ensemble Based On Member Selection

Posted on: 2019-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: X G Wu
GTID: 2428330545470257
Subject: Computer Science and Technology
Abstract/Summary:
Clustering is an important research direction in data mining. However, no single clustering algorithm works well on all data sets. Clustering ensembles are an effective way to address this problem: the main idea is to combine the results of multiple clustering algorithms so that the final result is better than that of any single algorithm. Compared with a single clustering algorithm, however, a clustering ensemble has higher complexity, and its result is easily affected by its members. In addition, existing clustering ensemble algorithms make little use of the constraint information available in real-world applications.

Through five experiments, this thesis compares and analyzes existing clustering ensemble algorithms. Based on the results, it selects k-means and spectral clustering to generate members, normalized cut to combine members, and 100 as the number of generated members. Building on this, the thesis improves clustering ensembles in two respects: member selection and the consensus function. The specific work is as follows.

First, Multiple Clustering and Selecting Approaches Based on Direct Combining (MCSDC) and Multiple Clustering and Selecting Approaches Based on Clustering Combining (MCSCC) are proposed to address the problem that existing clustering ensemble algorithms do not consider quality and diversity simultaneously. MCSDC uses four clustering algorithms to group members according to diversity, selects the member with the highest quality in each group, and combines these members directly to obtain the final selection. Starting from the members selected by MCSDC, MCSCC applies a k-means clustering-and-selecting method to obtain the final selection. The experimental results show that the two proposed member selection algorithms outperform the other algorithms.

Second, based on MCSDC, the thesis introduces constraint information into the consensus function and proposes Semi-supervised Selective Clustering Ensemble Based on Chameleon (SSCEC) and Semi-supervised Selective Clustering Ensemble Based on Ncut (SSCEN) to address the low utilization of constraint information in existing clustering ensemble algorithms. SSCEC uses the Chameleon algorithm as the consensus function and applies constraint information in sub-graph partitioning and sub-graph merging. SSCEN uses normalized cut as the consensus function and applies constraint information during graph cutting. The experimental results show that the two proposed semi-supervised selective clustering ensemble algorithms outperform other semi-supervised clustering ensemble algorithms.
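The selection idea described above (group members by diversity, then keep the highest-quality member of each group) can be illustrated with a minimal sketch. This is not the thesis's MCSDC implementation, and it rests on several assumptions: normalized mutual information (NMI) is used as the similarity between members, mean NMI with the other members stands in for "quality", and a greedy farthest-first grouping replaces the four clustering algorithms the thesis uses for grouping. The function names `nmi`, `select_members`, and `co_association` are hypothetical. The co-association matrix at the end is the kind of pairwise similarity graph that a consensus function such as normalized cut could then partition.

```python
import math
from collections import Counter


def nmi(a, b):
    """Normalized mutual information between two labelings of the same points."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    h_a = -sum(c / n * math.log(c / n) for c in pa.values())
    h_b = -sum(c / n * math.log(c / n) for c in pb.values())
    if h_a == 0 or h_b == 0:
        # Degenerate case: at least one labeling puts everything in one cluster.
        return 1.0 if h_a == h_b else 0.0
    mi = sum(c / n * math.log(c * n / (pa[x] * pb[y]))
             for (x, y), c in pab.items())
    return mi / math.sqrt(h_a * h_b)


def select_members(members, n_groups):
    """Pick one member per diversity group (assumed simplification, not MCSDC)."""
    m = len(members)
    sim = [[nmi(members[i], members[j]) for j in range(m)] for i in range(m)]
    # Assumed quality proxy: mean NMI with all other members.
    quality = [(sum(sim[i]) - sim[i][i]) / (m - 1) for i in range(m)]
    # Farthest-first seeding: start from the best member, then repeatedly
    # add the member least similar to the seeds chosen so far.
    seeds = [max(range(m), key=lambda i: quality[i])]
    while len(seeds) < n_groups:
        rest = [i for i in range(m) if i not in seeds]
        seeds.append(min(rest, key=lambda i: max(sim[i][s] for s in seeds)))
    # Assign every member to its most similar seed to form the groups.
    groups = {s: [] for s in seeds}
    for i in range(m):
        groups[max(seeds, key=lambda s: sim[i][s])].append(i)
    # Keep the highest-quality member of each non-empty group.
    return sorted(max(g, key=lambda i: quality[i]) for g in groups.values() if g)


def co_association(members):
    """Fraction of members that place each pair of points in the same cluster."""
    n = len(members[0])
    return [[sum(lab[i] == lab[j] for lab in members) / len(members)
             for j in range(n)] for i in range(n)]


# Toy ensemble: four base clusterings of six points.
members = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, relabeled
    [0, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1],  # a very different partition
]
selected = select_members(members, n_groups=2)
co = co_association([members[i] for i in selected])
print(selected)
```

In this sketch the number of groups is fixed by the caller. The abstract does not say how the thesis chooses the group count or measures quality and diversity, so those choices here are placeholders.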
Keywords/Search Tags: clustering ensemble, generative mechanism, consensus function, member selection, constraint information