Partial Multi-View Clustering Based On Sparse Embedding Framework

Posted on: 2020-03-24  Degree: Master  Type: Thesis
Country: China  Candidate: M S Ji  Full Text: PDF
GTID: 2428330578455268  Subject: Software engineering
Abstract/Summary:
With the development of computer technology, data collection channels and feature extractors have become increasingly diverse. The same object can therefore be described at different levels, and the resulting multiple views form multi-view data. Because of occlusion, instrument damage, and similar causes, these views are often incomplete. How to handle incomplete multi-view data, mine the information shared across views, and apply the consistency and complementarity principles of multi-view learning to perform multi-view clustering has therefore attracted wide attention in the field of machine learning.

At present, most existing methods for incomplete multi-view data use Non-negative Matrix Factorization (NMF) to obtain a common representation of the original incomplete data and then run k-means clustering on it to obtain the result. These "two-stage" multi-view clustering methods do not consider the relationship between incomplete multi-view data processing and clustering; that is, the clustering requirements are ignored in the data processing stage, which leaves room for improving the performance of the existing methods. In addition, NMF-based clustering methods for the missing-data problem consider neither the degree of sparsity of the coefficients nor the unsatisfactory basis matrices learned by NMF. Besides, most existing clustering methods only carry out dimensionality reduction before learning the model, and so cannot make full use of the discriminative information in the raw data.

In summary, the paper proposes a new partial multi-view clustering method based on a sparse embedding framework to cluster incomplete multi-view data. The method handles incomplete multi-view data well and obtains good clustering performance without completion. Starting from incomplete multi-view data and the serious "curse of dimensionality" brought by the era of big data, the paper studies how to combine Sparse Representation (SR) and Principal Component Analysis (PCA) to improve the performance of incomplete multi-view clustering. The main research contents are as follows:

a) Unlike traditional completion methods, the paper performs high-performance multi-view clustering without completion. We embed incomplete multi-view data into a low-dimensional space so that as little information as possible is lost after dimension reduction. In this space we learn the dictionaries, sparse representations, and projection matrices corresponding to the different views, and then cluster the paired and unpaired samples using the Hungarian algorithm.

b) By combining PCA and SR, we jointly learn the projection matrix and the dictionary. Constraining the projection matrix from the original space to the low-dimensional space to be orthogonal promotes sparsity in the low-dimensional space and preserves as much useful information from the original space as possible, which provides a basis for the discriminative ability of the dictionaries.

c) During the dictionary learning stage, we apply the Fisher discrimination criterion to the dictionaries rather than to the sparse coefficients, which makes the learned dictionaries more discriminative than those learned by traditional methods and lets them better represent the raw data.

Experiments on a synthetic dataset, the Extended Yale B dataset, the MNIST handwritten digit dataset, and the large-scale Caltech101 dataset show that the proposed method achieves better clustering performance than other state-of-the-art clustering methods.
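The abstract names the Hungarian algorithm but does not say how the assignment problem is set up. As a minimal sketch only, the Python below implements the standard Hungarian (Kuhn-Munkres) algorithm and applies it to one plausible use: aligning the cluster labels of two views on their paired samples by maximizing label overlap. The function names `hungarian` and `align_labels`, the overlap-based cost, and the toy labels are assumptions made for illustration; they are not taken from the thesis.

```python
def hungarian(cost):
    """Minimum-cost assignment for a square cost matrix (Kuhn-Munkres, O(n^3)).

    Returns a list `assignment` where assignment[row] = column.
    """
    n = len(cost)
    inf = float("inf")
    u = [0] * (n + 1)      # row potentials (1-based)
    v = [0] * (n + 1)      # column potentials (1-based)
    p = [0] * (n + 1)      # p[j] = row currently matched to column j (0 = none)
    way = [0] * (n + 1)    # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0 = p[j0]
            delta = inf
            j1 = 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j] = cur
                        way[j] = j0
                    if minv[j] < delta:
                        delta = minv[j]
                        j1 = j
            # Shift potentials so that at least one new edge becomes tight.
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:   # reached an unmatched column: path found
                break
        # Flip the matching along the augmenting path.
        while True:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
            if j0 == 0:
                break
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return assignment


def align_labels(labels_a, labels_b, k):
    """Map each cluster label of view A to a label of view B (hypothetical helper).

    Assumes both label lists cover the same paired samples, in the same order.
    """
    overlap = [[0] * k for _ in range(k)]
    for a, b in zip(labels_a, labels_b):
        overlap[a][b] += 1
    # The solver minimizes cost, so negate the overlap to maximize agreement.
    cost = [[-count for count in row] for row in overlap]
    return hungarian(cost)


if __name__ == "__main__":
    # Toy paired samples: the two views agree up to a relabeling of clusters.
    view_a = [0, 0, 1, 1, 2, 2]
    view_b = [2, 2, 0, 0, 1, 1]
    print(align_labels(view_a, view_b, k=3))
```

The solver works on any square cost matrix, so a different cost (for example, distances between cluster centers in the shared low-dimensional space) could be substituted without changing `hungarian` itself.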
Keywords/Search Tags: incomplete multi-view data, multi-view clustering, sparse embedding, Fisher discrimination analysis, dimension reduction