Font Size: a A A

Study Of Data Mining Algorithms Based On Spectral Clustering

Posted on:2019-04-19Degree:MasterType:Thesis
Country:ChinaCandidate:Y X WangFull Text:PDF
GTID:2428330572952234Subject:Signal and Information Processing
Abstract/Summary:PDF Full Text Request
With the coming of information age,the amount of information has grown as faster as an exponential explosion in various fields.It has become a research hotspot in the field of intelligent information processing that how to exploit potential regularity efficiently from mass data.Clustering has become more important as one of the main methods in data mining.The clustering algorithms don't need tagging manually,and it can mine category information according to the attributes and characteristics of the data.As a branch of the clustering algorithms,the spectral clustering algorithm can deal with non-convex data and solve the problem that the traditional clustering algorithm easily converges to the local optimal solution.However,in practice,the spectral clustering algorithm has high complexity and is limited by unsupervised learning.The performance of the algorithm needs to be improved.In this thesis,the sparse representation-based spectral clustering method is mainly studied,and an implicit constrained expansion algorithm based on strongly connected components is proposed.Further,a semi-supervised spectral clustering algorithm based on implicit constraint expansion is proposed.The main research contents and achievements are as follows.1.When selecting landmarks,it would be affected by data distribution and noise,which may cause the problem of uneven selection.A spectral clustering algorithm based on fast selection of landmark points is proposed in this thesis.The algorithm selects representative data in the clusters by iterating from the candidate set.This solves the problem of the non-uniformity of the selected landmarks and overcomes the problem of sparse representation errors.Experiments show that the proposed algorithm has better clustering accuracy.2.With regard to the limitation of the extension of pairwise constraints,an implicitly constrained expansion algorithm(TEC)based on strongly connected components is proposed in this thesis.The algorithm computes strongly connected components of undirected graphs and propagates the pairwise constraints based on implicit constraints.Step by step,the strongly connected components satisfying the extension of the implicit constraint are filtered and the constraint extension is realized.The experimental results show that the proposed algorithm can extend more supervision information,and it has fast expansion of implicit constraints.3.For existing semi-supervised spectral clustering algorithms can only fuse the insufficiency of partial pairwise constraint information.This thesis solves the problem from two aspects.On the one hand,based on the fusion of implicitly constrained supervision information in sparse representation matrix,an algorithm called sparse representation spectral clustering algorithm(LSC-EC)based on implicit constraints propagation.On the other hand,the implicit constraint is merged in the partition cost function,and the Constrained One Spectral Clustering based on the implicit constraint Extension(ECOSC)is proposed.LSC-EC extends the implicit constrains of strongly connected components to obtain more accurate connected regions and uses the supervisory information to update the sparse representation matrix.ECOSC add more constraint information in the partition cost function by contraints propagation,and convert it into a continuous optimization problem.The experimental results show that LSC-EC and ECOSC can achieve better clustering results,which proves that implicit constraint extension plays a guiding role in clustering to a certain extent.
Keywords/Search Tags:Spectral Clustering, Sparse Representation, Pairwise Constraints, Constraint Propagation, Semi-Supervised Clustering
PDF Full Text Request
Related items