
Adaptive Regularized Semi-supervised Clustering Ensemble Study Based On Constraint Selection

Posted on: 2021-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: R Luo
GTID: 2428330611466951
Subject: Computer Science and Technology
Abstract/Summary:
Machine learning is now applied across many industries, and the scale and complexity of data keep growing with the rapid development of the Internet. Efficient cluster analysis has therefore become an important research topic that has drawn considerable attention in recent years. However, traditional clustering methods still have many shortcomings. For example, results obtained from a single clustering algorithm generalize poorly and are not sufficiently convincing. Because ensemble learning can integrate multiple individual strategies into one comprehensive solution, clustering ensemble methods can effectively overcome the limitations of a single clustering algorithm.

Traditional clustering ensemble approaches have two limitations: (1) when generating ensemble members, they do not use the prior knowledge about the dataset provided by experts, expressed as must-link and cannot-link constraints; (2) they ignore the negative effects of redundancy and noise, so all ensemble members are considered, even those that make no positive contribution. In addition, acquiring labeled data, or having data labeled by a dedicated annotator, is costly in manpower and material resources, while unlabeled data is relatively easy to obtain. Clustering based on semi-supervised learning was developed to make full use of a small amount of labeled data. Given the problems of clustering ensembles and the effectiveness of semi-supervised learning, semi-supervised clustering ensemble methods have been proposed; combining semi-supervised learning with ensemble learning can significantly improve the accuracy, stability and robustness of clustering results.

To address the above shortcomings, this thesis proposes an approach that combines multiple semi-supervised clustering solutions by adaptively regularizing the weights of the clustering ensemble members, referred to as the Adaptive Regularized Semi-supervised Clustering Ensemble (ARSCE) method. First, ARSCE generates a series of feature subspaces by randomly selecting features without replacement, which avoids producing two identical feature subspaces. Second, feature transformation is performed on these subspaces, taking the pairwise constraints into account, to find new clustering-friendly spaces in which clustering methods are applied to generate diverse clustering solutions. Finally, a novel fusion strategy integrates the multiple clustering solutions into a unified partition, with a weight assigned to each ensemble member. To verify the clustering performance of the proposed method, extensive experiments are conducted on multiple real-world benchmarks; these compare ARSCE with many classical algorithms and analyze its parameter sensitivity in detail. The experimental results demonstrate the effectiveness of ARSCE and its superiority over its counterparts.
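The first and last steps described above can be illustrated with a small, standard-library-only Python sketch. It is not the thesis's implementation: the abstract does not give the adaptive regularization formula, so the member weight used here (the fraction of pairwise constraints a partition satisfies) and the consensus rule (thresholding a weighted co-association matrix and taking connected components) are stand-in assumptions. The constraint-aware feature transformation and the base clustering step are omitted, and all function names and the toy partitions are hypothetical.

```python
import random
from math import comb


def random_subspaces(n_features, subspace_size, n_subspaces, seed=0):
    """Draw distinct feature subsets, sampling features without replacement."""
    if n_subspaces > comb(n_features, subspace_size):
        raise ValueError("not enough distinct subspaces of this size")
    rng = random.Random(seed)
    seen, subspaces = set(), []
    while len(subspaces) < n_subspaces:
        subset = tuple(sorted(rng.sample(range(n_features), subspace_size)))
        if subset not in seen:  # reject duplicates so no two subspaces match
            seen.add(subset)
            subspaces.append(subset)
    return subspaces


def constraint_weight(labels, must_link, cannot_link):
    """Assumed stand-in weight: fraction of pairwise constraints satisfied."""
    satisfied = sum(labels[i] == labels[j] for i, j in must_link)
    satisfied += sum(labels[i] != labels[j] for i, j in cannot_link)
    total = len(must_link) + len(cannot_link)
    return satisfied / total if total else 1.0


def weighted_coassociation(partitions, weights):
    """Weighted share of members that place samples i and j together."""
    n = len(partitions[0])
    total = sum(weights)
    co = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += w / total
    return co


def consensus_labels(co, threshold=0.5):
    """Assumed fusion rule: connected components of the thresholded matrix."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy example: three hand-written partitions of four samples (not real results).
subspaces = random_subspaces(n_features=6, subspace_size=3, n_subspaces=4)
partitions = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
must_link, cannot_link = [(0, 1)], [(1, 2)]
weights = [constraint_weight(p, must_link, cannot_link) for p in partitions]
consensus = consensus_labels(weighted_coassociation(partitions, weights))
print(weights, consensus)
```

In this toy run the third partition violates the must-link constraint, so it receives a lower weight and the consensus follows the two partitions that agree with the constraints. A learned, regularized weight vector as proposed in the thesis would replace `constraint_weight`.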
Keywords/Search Tags: clustering ensemble, semi-supervised clustering, constraint selection, clustering fusion