Research On Key Technologies Of Clustering High-dimensional Data Based On Sparse Subspace And Their Applications

Posted on: 2017-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: Q J Huang
GTID: 2348330485484559
Subject: Computer application technology
Abstract/Summary:
As computing and internet technology advance, data are becoming richer and more complex, and analyzing such data has become an active research topic. Within this area, clustering of high-dimensional data is a key research problem.

The curse of dimensionality prevents conventional clustering methods from producing good results on high-dimensional data. This has motivated dedicated high-dimensional clustering methods, which mainly comprise feature selection or feature extraction techniques and subspace clustering methods. Because feature selection and feature extraction cannot guarantee that the information in the original data is fully preserved, subspace clustering has become the main research direction. Subspace clustering methods differ in how they search the data; CLIQUE and PROCLUS, for example, each use a particular search strategy to find clusters within subspaces. However, because data distributions are not uniform, these methods still give unsatisfactory results.

This thesis focuses on sparse subspace clustering (SSC). Based on the self-expression of data, SSC uses sparse optimization to transform high-dimensional data clustering into a graph partitioning problem, which can then be solved by spectral clustering, a robust method.

The thesis first introduces classical clustering methods, high-dimensional data clustering methods, and subspace clustering methods. It then examines the principles and the complete algorithm of sparse subspace clustering. Finally, it extracts image features with methods based on Local Binary Patterns (LBP) and clusters the resulting feature sets with sparse subspace clustering.

Major achievements and innovations:

1. Improvement of SSC based on the normalized Laplacian matrix L_rw. The improvement reduces the time complexity of SSC and speeds up the clustering of high-dimensional data.

2. Setting the initial K-means cluster centers based on the weight matrix W, which improves SSC. K-means is the final step of SSC. Using the properties of the weight matrix W obtained from the sparse optimization, the thesis chooses more reasonable initial cluster centers for this step. This avoids the local optima and unstable clustering results that K-means can otherwise produce, and so improves clustering accuracy.

3. A texture clustering method based on multi-scale LBPROT and SSC. The method first uses multi-scale LBPROT to extract high-dimensional features from texture images. The improved SSC is then applied to cluster these high-dimensional features, which carry rich texture information. The proposed method achieves better clustering results.

4. A face clustering method based on dividing the facial image into local regions and on LBP "uniform" patterns. The method first uses LBP uniform patterns to extract a texture descriptor from each local facial region independently. The descriptors are then concatenated to form a global description of the face, which encodes both the appearance and the spatial relations of the facial regions. The improved SSC is then applied to cluster the global descriptors. The proposed face clustering method achieves better clustering results.
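The SSC pipeline summarized in the abstract (self-expression, sparse optimization, a graph built from the weight matrix W, spectral partitioning with a normalized Laplacian, and a final K-means step) can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation. Several choices are assumptions made only for illustration: the sparse optimization is written in a lasso-style relaxed form and solved with plain proximal gradient (ISTA); L_rw is taken to be the random-walk Laplacian I - D^{-1} W; and the K-means centers are seeded with a simple farthest-point rule, because the abstract does not state how the thesis derives initial centers from W. The function names and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np


def sparse_codes(X, lam=0.05, n_iter=300):
    """Self-expression step: approximate X ~= X @ C with sparse C, diag(C) = 0.

    X has shape (d, n); each column is one data point.
    Assumed objective: 0.5 * ||X - X C||_F^2 + lam * ||C||_1 (lasso-style form).
    """
    n = X.shape[1]
    G = X.T @ X                                   # Gram matrix of the points
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-12)  # 1 / Lipschitz constant
    C = np.zeros((n, n))
    for _ in range(n_iter):
        C = C - step * (G @ C - G)                # gradient step on the data term
        C = np.sign(C) * np.maximum(np.abs(C) - lam * step, 0.0)  # soft threshold
        np.fill_diagonal(C, 0.0)                  # a point may not represent itself
    return C


def random_walk_embedding(W, k):
    """Eigenvectors of L_rw = I - D^{-1} W for the k smallest eigenvalues.

    Computed through the symmetric Laplacian: if v is an eigenvector of
    L_sym = I - D^{-1/2} W D^{-1/2}, then D^{-1/2} v is an eigenvector of L_rw.
    """
    deg = np.maximum(W.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)               # eigenvalues in ascending order
    return d_inv_sqrt[:, None] * vecs[:, :k]


def kmeans(Y, k, n_iter=100):
    """Lloyd's K-means with farthest-point seeding (an illustrative stand-in,
    not the W-based initialization proposed in the thesis)."""
    centers = [Y[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq_dist = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(sq_dist, axis=1)
        new_centers = np.array([
            Y[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def ssc(X, k, lam=0.05):
    """Cluster the columns of X into k groups; returns one label per column."""
    X = X / np.maximum(np.linalg.norm(X, axis=0, keepdims=True), 1e-12)
    C = sparse_codes(X, lam)
    W = np.abs(C) + np.abs(C).T                   # symmetric weight matrix
    Y = random_walk_embedding(W, k)
    return kmeans(Y, k)
```

Calling `ssc(X, k)` on a `(d, n)` array whose columns are feature vectors (for example, LBP histograms) returns an integer label for each column. Swapping the seeding rule inside `kmeans` is where a W-based initialization such as the one the thesis proposes would plug in.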
Keywords/Search Tags: High-dimensional Data Clustering, Sparse Subspace Clustering (SSC), Texture Clustering, Face Clustering