Research On Capped Robust Clustering Algorithm

Posted on: 2019-06-21 | Degree: Master | Type: Thesis
Country: China | Candidate: T Zhang | Full Text: PDF
GTID: 2428330569479282 | Subject: Software engineering
Abstract/Summary:
With the rapid development and widespread application of information technology, many industries have accumulated large amounts of data. How to process these data effectively and extract latent, useful information from them is an important research topic. Cluster analysis is one of the key tools of data mining and is becoming increasingly important. Researchers have proposed many clustering algorithms, which are widely used in image processing, pattern recognition, natural language processing, and other fields. However, traditional clustering algorithms have shortcomings, such as sensitivity to outliers and poor robustness, and their clustering performance needs further improvement. This thesis combines the capped function with clustering algorithms to improve their robustness and clustering performance. The main research work of this thesis is as follows:

1. A capped robust K-means algorithm (CRK-means) is proposed. To address the sensitivity of the traditional K-means algorithm to outliers, the method applies the idea of the capped function and introduces a denoising factor into the objective function. Auxiliary variables are added while solving the objective function so that the sample weights are updated dynamically in each iteration, which reduces the impact of outliers and improves the robustness and accuracy of the algorithm.

2. A capped robust subspace clustering algorithm (CRSC) is proposed. To address the sensitivity of subspace clustering to noise, the method uses the capped function to penalize the noise term, which reduces the influence of outliers in the data on the construction of the coefficient matrix and improves the robustness of the algorithm. In addition, the method self-represents all samples based on the correlation between samples and adds local similarity constraints. While preserving the global constraints, it strengthens the expression of local structure among samples. The method therefore obtains a better affinity matrix and, in turn, more robust clustering results.

3. Experiments on the capped robust K-means algorithm and the capped robust subspace clustering algorithm are conducted. Using UCI standard data sets, synthetic data sets, and image data sets, experimental analysis and parameter analysis of the proposed algorithms are performed. The results show that the CRK-means and CRSC algorithms presented in this thesis improve both robustness and clustering accuracy.
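
The capped idea behind a robust K-means variant can be illustrated with a minimal sketch. This is not the thesis's CRK-means formulation, which the abstract does not spell out. It assumes a capped squared-distance loss, min(||x_i - c||^2, cap), solved by alternating between cluster assignments, 0/1 auxiliary sample weights, and weighted centroid updates. The function name `capped_kmeans`, the parameters `cap` and `init`, and the demo data are all hypothetical.

```python
import numpy as np


def capped_kmeans(X, k, cap, init=None, n_iter=100, seed=0):
    """Sketch: k-means with a capped squared-distance loss (assumed form).

    Approximately minimizes sum_i min(||x_i - c_{a_i}||^2, cap).
    Samples farther than `cap` from their nearest center get weight 0,
    so they do not pull the centroids.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    if init is None:
        rng = np.random.default_rng(seed)
        centers = X[rng.choice(n, size=k, replace=False)].copy()
    else:
        centers = np.asarray(init, dtype=float).copy()

    def assign(c):
        # Squared distances from every sample to every center: shape (n, k).
        d2 = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Auxiliary weights, recomputed on every call: 1 inside the cap, 0 outside.
        weights = (d2[np.arange(n), labels] <= cap).astype(float)
        return labels, weights

    for _ in range(n_iter):
        labels, weights = assign(centers)
        new_centers = centers.copy()
        for j in range(k):
            w = weights * (labels == j)
            if w.sum() > 0:  # keep the old center if no inlier is assigned
                new_centers[j] = (w[:, None] * X).sum(axis=0) / w.sum()
        converged = np.allclose(new_centers, centers)
        centers = new_centers
        if converged:
            break

    labels, weights = assign(centers)
    return centers, labels, weights


# Hypothetical demo: two tight clusters plus one distant outlier.
X_demo = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                   [10, 10], [10, 11], [11, 10], [11, 11],
                   [100, 100]])
centers, labels, weights = capped_kmeans(X_demo, k=2, cap=9.0, init=X_demo[[0, 4]])
```

In this demo the point at (100, 100) falls outside the cap, receives weight 0, and leaves both centroids at the means of their inlier points. With plain K-means the same point would drag its cluster's centroid toward it. The result is sensitive to the choice of `cap` and to initialization, which is one reason the sketch accepts explicit initial centers.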
Keywords/Search Tags: Clustering, Robust, K-means Algorithm, Subspace Clustering, Capped Norm, Noise Data