Improvement Research Of Clustering Algorithm Based On High-dimensional Data

Posted on: 2019-09-10
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Jiang
Full Text: PDF
GTID: 2428330566499391
Subject: Computer technology
Abstract/Summary:
The complexity, sparsity, and diversity of high-dimensional data limit the effectiveness of traditional clustering algorithms. Cluster analysis for high-dimensional data has therefore become one of the most important research directions in data mining. Subspace clustering extends traditional clustering to high-dimensional spaces and can cluster high-dimensional data effectively. Sparse subspace clustering is a subspace clustering algorithm based on spectral clustering. It is independent of the dimension and number of subspaces, and it can handle noisy data and outliers. Building on existing clustering algorithms, this thesis improves clustering for high-dimensional data from two starting points, the traditional K-means algorithm and the sparse subspace clustering algorithm, and implements an application. First, the thesis designs an optimized K-means algorithm, the DK-means algorithm, which combines the "distance method" and the "density method" to determine the initial cluster centers. Because the additional computation increases the time complexity of the DK-means algorithm, the thesis then designs the EDK-means algorithm, which adds an optimization strategy based on "safe distance" and further improves both the execution efficiency and the clustering quality of DK-means. In addition, a new self-representation model is designed by introducing Trace Lasso into the regularization term of sparse subspace clustering. The EDK-means algorithm is then applied in the spectral clustering step to obtain the TL-MSR subspace clustering algorithm, which addresses the performance problem of the clustering algorithm. Finally, the improved clustering algorithms are verified experimentally against the original algorithms; the results show that the improved algorithms outperform the originals. The thesis also designs a prototype system for the clustering algorithms.
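
The abstract does not say how the "distance method" and the "density method" are combined, so the following is only a minimal sketch of one plausible reading, not the thesis's DK-means algorithm. It assumes that density means the number of neighbours within a radius, and that each new center is the point with the highest product of density and distance to the nearest center already chosen. The function name `density_distance_init`, the `radius` parameter, and its default value are all hypothetical.

```python
import numpy as np

def density_distance_init(X, k, radius=None):
    """Pick k initial centers that are dense and mutually far apart.

    Illustrative only: the scoring rule is an assumption, not the
    method described in the thesis.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    if radius is None:
        # Assumed default: the mean distance over all distinct pairs.
        radius = dist.sum() / (n * (n - 1))
    # Density of a point = number of other points within `radius`.
    density = (dist <= radius).sum(axis=1) - 1
    # First center: the densest point.
    chosen = [int(np.argmax(density))]
    while len(chosen) < k:
        # Distance from every point to its nearest chosen center.
        nearest = dist[:, chosen].min(axis=1)
        # Favor dense points that are far from the existing centers.
        score = density * nearest
        score[chosen] = -1.0  # never pick the same point twice
        chosen.append(int(np.argmax(score)))
    return X[chosen]
```

The returned array could be passed as the starting centers of any standard K-means loop. Computing the full distance matrix costs O(n^2) time and memory, which is the kind of extra cost the abstract says motivated the EDK-means refinement.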
Keywords/Search Tags: High-dimensional data, Subspace clustering, K-means algorithm, Sparse subspace clustering, Spectral clustering