Improvement Research Of Clustering Algorithm Based On High-dimensional Data

Posted on: 2019-09-10

Degree: Master

Type: Thesis

Country: China

Candidate: J Y Jiang

GTID: 2428330566499391

Subject: Computer technology

Abstract/Summary:

The complexity, sparsity, and diversity of high-dimensional data limit the effectiveness of traditional clustering algorithms. Clustering analysis for high-dimensional data has therefore become one of the most important research directions in data mining. Subspace clustering extends traditional clustering to high-dimensional spaces and can cluster high-dimensional data effectively. Sparse subspace clustering is a subspace clustering algorithm based on spectral clustering. It does not depend on the dimension or the number of subspaces, and it can handle noisy data and singular points.

Building on existing clustering algorithms, this thesis improves clustering for high-dimensional data from two starting points, the traditional K-means algorithm and the sparse subspace clustering algorithm, and implements an application. It first designs an optimized K-means algorithm, the DK-means algorithm, which combines a "distance method" and a "density method" to determine the initial cluster centers. Because this additional computation increases the time complexity of DK-means, the thesis then designs the EDK-means algorithm, which adds an optimization strategy based on "safe distance" and further improves both the execution efficiency and the clustering quality of DK-means. In addition, a new self-representation model is designed by introducing the Trace Lasso into the regularization term of sparse subspace clustering. The EDK-means algorithm is then applied in the spectral clustering step to obtain the TL-MSR subspace clustering algorithm, which addresses the performance problem of the clustering algorithm.

Finally, the improved clustering algorithms are verified experimentally against the original ones. The results show that the improved algorithms outperform the original ones. The thesis also designs a prototype system for the clustering algorithm.
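The abstract does not say how the "distance method" and "density method" are combined, so the following is only a minimal sketch of one plausible reading, not the thesis's DK-means algorithm. It assumes density is the number of neighbours within a fixed radius and that each new center maximizes density times distance to the nearest center already chosen. The function name `density_distance_init`, the `radius` parameter, and the scoring rule are all hypothetical.

```python
import numpy as np

def density_distance_init(X, k, radius):
    """Pick k initial centers from the rows of X (hypothetical scoring rule).

    The densest point is chosen first. Each later center is the point that
    maximizes (neighbour count + 1) * (distance to nearest chosen center).
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    # Assumed density: number of other points within `radius`.
    density = (dist <= radius).sum(axis=1) - 1
    chosen = [int(np.argmax(density))]
    while len(chosen) < k:
        nearest = dist[:, chosen].min(axis=1)
        score = nearest * (density + 1)
        score[chosen] = -np.inf  # never pick the same point twice
        chosen.append(int(np.argmax(score)))
    return X[chosen]

# Usage on three well-separated synthetic blobs.
rng = np.random.default_rng(0)
means = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([rng.normal(loc=m, scale=0.1, size=(20, 2)) for m in means])
centers = density_distance_init(X, k=3, radius=0.3)
print(centers)
```

Centers chosen this way could be passed to a standard K-means run as its starting points. The full pairwise distance matrix costs O(n^2) time and memory, which matches the abstract's remark that the improved initialization adds computation and motivates a later speed-up.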

Keywords/Search Tags:

High-dimensional data, Subspace clustering, K-means algorithm, Sparse subspace clustering, Spectral clustering

Related items

1	Research On Improved Sparse Subspace Clustering Algorithm
2	Research On Key Technologies Of Clustering High-dimensional Data Based On Sparse Subspace And Their Applications
3	Study On High-dimensional Data Subspace Clustering Analysis And Application
4	Research On Sparse Subspace Clustering Algorithm And Theory Under Noise
5	High-dimensional Data Clustering Method Based On Embedded Subspace
6	Research On Sparse Subspace Clustering Models And Algorithms Based On Low-rank Representation
7	Research On Subspace Clustering Algorithm Based On Sparse, Adaptive And Hypergraph
8	Research On Sparse Subspace Clustering And Its Fast Algorithms
9	Research On Improved Subspace Clustering Algorithm
10	Research Of Sparse Subspace Clustering And Its Applications