Research On Spectral Clustering Algorithm Based On Feature Selection

Posted on: 2022-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: Q M Luo
Full Text: PDF
GTID: 2518306770471794
Subject: Computer Software and Application of Computer
Abstract/Summary:
With the rapid development of society and advances in science and technology, the number of Internet users has grown dramatically, generating an enormous and rapidly growing volume of data that fuels the development of big data. However, much of this data is redundant or high-dimensional and cannot be used effectively, so making effective use of it is a major challenge. In addition, many existing algorithms require a great deal of time to select appropriate hyperparameters; that is, the time cost of hyperparameter tuning is high. Moreover, many spectral clustering algorithms do not address the problem of unbalanced clustering results. This thesis therefore explores algorithms that avoid, as far as possible, clusters that contain very few or very many samples.

To address these problems, this thesis proposes two spectral clustering algorithms that build on traditional clustering algorithms to handle high-dimensional data and severely unbalanced clustering results. They combine feature selection, a balance constraint, local structure learning, K-nearest neighbors, subspace learning, and spectral graph theory. The main work is as follows.

The first algorithm is the Spectral Clustering Algorithm Based on Feature Selection for Hyper-parameter Self-tuning. It eliminates one hyperparameter, which reduces the time spent on hyperparameter tuning, and introduces a regularization term that selects features to reduce the influence of noisy and redundant features. Finally, locality preserving projection, a method based on the graph Laplacian, is introduced to preserve the local structure of the samples. Results on several benchmark data sets show that the proposed algorithm outperforms the comparison algorithms.

The second algorithm is the Spectral Clustering Algorithm Based on Feature Selection and Balanced Constraint. It uses feature selection to reduce the influence of noisy and redundant features and locality preserving projection to preserve the local structure of the samples. A balance regularization term is then introduced so that, while the number of features is reduced, the cluster sizes are as balanced as possible. Finally, the least squares method is used to calculate the sample error. Extensive experiments show that the proposed algorithm scores better than the comparison algorithms on clustering metrics in most cases, which demonstrates its effectiveness.

In summary, the first algorithm targets the time wasted on manually tuning the hyperparameters of spectral clustering algorithms in traditional machine learning, and the second targets the imbalance that arises when some clusters contain too many or too few samples. The experimental results show that both algorithms outperform the comparison algorithms in most cases. Future work will consider the multidimensional nature of samples in cluster analysis and apply the algorithms to more practical scenarios.
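
For orientation, the following is a minimal sketch of the baseline that the abstract builds on: unnormalized spectral clustering on a K-nearest-neighbor graph, followed by a simple measure of how balanced the resulting clusters are. It is not the thesis's method: it has no feature selection term, no locality preserving projection, and no balance regularizer. All function names (`knn_affinity`, `spectral_embedding`, `kmeans`, `spectral_clustering`, `balance_ratio`), the 0/1 edge weights, and the farthest-point k-means initialization are assumptions made for illustration.

```python
import numpy as np


def knn_affinity(X, k):
    """Symmetric 0/1 k-nearest-neighbor affinity matrix without self-loops."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # pairwise squared distances
    np.fill_diagonal(d2, np.inf)             # a point is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]     # k nearest neighbors of each point
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(W, W.T)                # connect i and j if either lists the other


def spectral_embedding(W, n_clusters):
    """Eigenvectors of L = D - W for the n_clusters smallest eigenvalues."""
    L = np.diag(W.sum(axis=1)) - W           # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)              # eigh returns eigenvalues in ascending order
    return vecs[:, :n_clusters]


def kmeans(Z, n_clusters, n_iter=50):
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, n_clusters):
        dist = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[dist.argmax()])     # next center: point farthest from chosen ones
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)           # assign each row to its nearest center
        for c in range(n_clusters):
            if np.any(labels == c):          # keep the old center if a cluster empties
                centers[c] = Z[labels == c].mean(axis=0)
    return labels


def spectral_clustering(X, n_clusters, k=5):
    """Baseline pipeline: kNN graph -> Laplacian eigenvectors -> k-means."""
    W = knn_affinity(X, k)
    return kmeans(spectral_embedding(W, n_clusters), n_clusters)


def balance_ratio(labels, n_clusters):
    """Smallest cluster size divided by the largest; 1.0 means perfectly balanced."""
    sizes = np.bincount(labels, minlength=n_clusters)
    return sizes.min() / sizes.max()
```

Calling `spectral_clustering(X, n_clusters=2, k=3)` on a data matrix `X` with one sample per row returns an integer label per sample. `balance_ratio` then gives a rough indication of the kind of cluster-size imbalance the second algorithm is meant to reduce: values near 0 indicate that one cluster holds very few samples relative to another.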
Keywords/Search Tags: Spectral clustering, Balance constraint, Feature selection, Locality preserving projection