
Research On Ensemble Of Entropy Regularized Soft Subspace Clustering

Posted on:2022-12-12
Degree:Master
Type:Thesis
Country:China
Candidate:M Li
GTID:2518306755472684
Subject:Information and Post Economy
Abstract/Summary:
Cluster analysis is a classical data partitioning method, but it often performs poorly on high-dimensional datasets because clusters frequently exist only in low-dimensional subspaces. Subspace clustering is an effective way to avoid the "curse of dimensionality", and soft subspace clustering extends the traditional feature-weighted clustering algorithm. Entropy-regularized soft subspace clustering algorithms, such as ERKM (Entropy Regularization K-Means), are widely used for subspace clustering of high-dimensional data; they add a negative entropy term to the objective function to encourage more feature dimensions to contribute to each subspace. However, the negative entropy coefficient must be set in advance, its suitable range depends strongly on the problem, and its value has a large impact on the clustering results.

This thesis therefore addresses the influence of the negative entropy coefficient from two directions. The first transforms the problem into a soft subspace clustering problem with an entropy constraint. The second exploits the sensitivity to the negative entropy coefficient to generate multiple clustering results, and then integrates them with ensemble techniques to obtain more accurate and robust clusterings. The main work of this thesis is as follows:

(1) A soft subspace clustering algorithm with an entropy constraint, ERKM+, is proposed to address the selection of the negative entropy coefficient. Specifically, the negative entropy is moved from the objective function into the constraints, and a new parameter is introduced to bound the negative entropy value. Experiments on UCI datasets show that this method can improve clustering performance to a certain extent compared with other algorithms.

(2) A hedonic game-based soft subspace clustering ensemble algorithm, SSEH (Soft Subspace clustering Ensemble based on Hedonic games), is proposed. The coefficient sensitivity of the entropy-regularized soft subspace clustering algorithm is used to generate multiple base clusterings, and the idea of hedonic games is then combined with clustering ensemble so that the ensemble result converges to a Nash stable coalition structure. Because cluster quality affects the accuracy of the clustering result, SSEH uses the ECI (Ensemble-driven Cluster Index) to measure the stability of clusters in the base clusterings, and designs a new measure of the similarity between two data points to make the Nash stable coalition structure more reasonable. To make the number of clusters in the coalition structure match the actual number of clusters, SSEH merges coalitions in the Nash stable state by minimizing the loss of social welfare. In the experiments reported in this thesis, SSEH obtains better results than the other clustering ensemble algorithms it is compared with.

(3) A soft subspace clustering ensemble algorithm based on random walks, CERW (Clustering Ensemble based on Random Walk), is proposed. First, the coefficient sensitivity of the entropy-regularized soft subspace clustering algorithm is used to generate multiple base clusterings. A cluster-level similarity graph is then constructed, with the clusters generated by the hedonic game as nodes and edges weighted by the social welfare values of the clusters. The structural connections between cluster nodes are discovered through random walk trajectories, which guide a further ensemble of the clusters. Experimental results show that CERW outperforms the comparison algorithms.

Finally, this thesis presents a method for retrieving the subspace of the clustering ensemble result, which explains the subspace in which the consensus partition lies from the perspective of feature weighting.
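The sensitivity to the negative entropy coefficient described in the abstract can be illustrated with a minimal sketch. It assumes the closed-form feature-weight update commonly used in entropy-weighted k-means, where the weight of feature j in a cluster is proportional to exp(-D_j / gamma), with D_j the within-cluster dispersion on that feature. The function names, the symbol `gamma`, and the dispersion values below are hypothetical and are not taken from the thesis; the exact ERKM and ERKM+ formulations may differ.

```python
import math

def entropy_feature_weights(dispersions, gamma):
    """Feature weights for one cluster: w_j proportional to exp(-D_j / gamma).

    dispersions: per-feature within-cluster dispersions D_j (illustrative values).
    gamma: negative entropy coefficient, must be > 0 (assumed name).
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    # Shift by the minimum dispersion for numerical stability;
    # the normalized weights are unchanged by this shift.
    d_min = min(dispersions)
    scores = [math.exp(-(d - d_min) / gamma) for d in dispersions]
    total = sum(scores)
    return [s / total for s in scores]

def weight_entropy(weights):
    """Shannon entropy of a weight vector (terms with w = 0 contribute 0)."""
    return -sum(w * math.log(w) for w in weights if w > 0)

if __name__ == "__main__":
    dispersions = [0.2, 0.5, 3.0, 4.0]  # hypothetical per-feature dispersions
    for gamma in (0.1, 1.0, 10.0):
        w = entropy_feature_weights(dispersions, gamma)
        print(gamma, [round(x, 3) for x in w], round(weight_entropy(w), 3))
```

Under this assumed update, a small `gamma` concentrates almost all weight on the lowest-dispersion feature, while a large `gamma` spreads weight nearly uniformly. Running the same data over a range of coefficients therefore gives noticeably different subspaces, which is the kind of diversity the abstract says is used to generate base clusterings.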
Keywords/Search Tags:soft subspace clustering, clustering ensemble, hedonic game, random walk, subspace retrieval