Research On Subspace Clustering Algorithms And Its Applications

Posted on: 2017-05-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Xu
Full Text: PDF
GTID: 2308330488482715
Subject: Computer Science and Technology
Abstract/Summary:
Cluster analysis is one of the crucial research methods in data mining. It divides a data set into categories, or clusters, according to the degree of similarity between samples. Because it is unsupervised, cluster analysis has been applied in many fields, including electronic commerce, bioinformatics, web log analysis, and financial transactions. However, owing to the curse of dimensionality, the accuracy and stability of clustering results decrease significantly when traditional clustering algorithms are applied to high-dimensional data. In recent years, clustering high-dimensional data has become both a focus and a difficulty in the field of artificial intelligence. Subspace clustering emerged in response. Its basic idea is to divide the feature space containing all the original data into different feature subsets, to evaluate the significance of each data partition from different subspaces according to certain rules, and at the same time to search for the feature subspace corresponding to the data. On high-dimensional datasets, subspace clustering algorithms have obtained relatively satisfying results. To address several shortcomings of two types of subspace clustering algorithms, those based on membership degree and those based on the self-expressive model, this thesis analyzes a number of existing subspace clustering algorithms and proposes several novel, improved algorithms. The main contents are outlined as follows.

(1) Existing soft subspace clustering algorithms randomly choose sample points from the data set as the cluster centers, so the process easily falls into a local optimum. To solve this problem, we propose a novel fuzzy clustering algorithm based on the soft subspace clustering framework, which combines the quantum-behaved particle swarm optimization (QPSO) algorithm with the gradient descent method to optimize the soft subspace clustering objective function. The algorithm uses the global search capability of QPSO to find globally optimal cluster centers in the subspace, and uses gradient descent, with its fast convergence, to solve for the fuzzy weights and membership degrees of the sample points. Experimental results on UCI datasets demonstrate that the proposed method improves both the accuracy and the stability of the clustering results.

(2) The objective function of conventional soft subspace clustering algorithms is based on the Euclidean distance. With high-dimensional samples, the curse of dimensionality causes the Euclidean distance metric to fail during optimization. To overcome the shortcoming of an objective function based solely on the Euclidean distance, we apply correntropy to soft subspace clustering. Under the new objective function, the fuzzy weights and membership degree matrices of the sample points are solved with new update formulas, and QPSO is then used to find globally optimal cluster centers in the subspace. On UCI datasets, we analyze the Rand index, the normalized mutual information, and a significance test of the algorithm. The experimental results show that our algorithm obtains better clustering performance.

(3) Conventional subspace clustering algorithms based on the self-expressive model adopt sparse or low-rank representations and perform clustering by including error and noise terms in their objective functions. The similarity matrix is then solved via the Alternating Direction Method of Multipliers (ADMM). However, these approaches require the characteristics of the errors and outliers in the sample points to be known as prior information. Furthermore, their iterative process is time-consuming. Motivated by these observations, we propose a novel method that uses ridge regression as the objective function and applies affine criteria to subspace clustering. An analytic solution for the coefficient matrix is obtained. Experimental results on face datasets demonstrate that the proposed method improves not only the accuracy of the clustering results but also their robustness. Furthermore, the proposed method reduces the computational complexity.
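
The idea in item (3), replacing an iterative ADMM solve with a closed-form ridge regression solution, can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes the standard unconstrained objective min_C ||X - XC||_F^2 + lam * ||C||_F^2, whose analytic solution is C = (X^T X + lam * I)^(-1) X^T X, and it omits the affine criteria the abstract mentions, since the abstract does not state them. The function names, the parameter `lam`, and the toy data are hypothetical, chosen only for illustration.

```python
import numpy as np

def ridge_self_expression(X, lam=0.1):
    """Closed-form ridge self-expression (illustrative, not the thesis's method).

    X has shape (d, n), one sample per column. Returns the (n, n) matrix C that
    minimizes ||X - X C||_F^2 + lam * ||C||_F^2. The affine criteria from the
    abstract are NOT modelled here.
    """
    n = X.shape[1]
    gram = X.T @ X
    # Setting the gradient to zero gives (X^T X + lam I) C = X^T X.
    return np.linalg.solve(gram + lam * np.eye(n), gram)

def affinity_from_coefficients(C):
    """Build a symmetric, non-negative similarity matrix from C."""
    return np.abs(C) + np.abs(C).T

# Toy data (assumed): three points on the x-axis and three on the y-axis of R^3,
# i.e. two orthogonal one-dimensional subspaces.
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
X = np.column_stack([a * u for a in (1.0, 2.0, -1.5)] +
                    [b * v for b in (0.5, -2.0, 3.0)])

C = ridge_self_expression(X, lam=0.1)
W = affinity_from_coefficients(C)
```

In this toy case the two subspaces are orthogonal, so the Gram matrix is block diagonal and the similarity between points from different subspaces vanishes. A full pipeline would typically pass `W` to a spectral clustering step, which is left out of the sketch.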
Keywords/Search Tags: Subspace Clustering, QPSO, Correntropy, Gradient Descent, Affine Subspace, Ridge Regression, Analytic Solution