Research On Clustering Algorithm Based On Multi-view Data

Posted on: 2022-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: M Q Gu
Full Text: PDF
GTID: 2518306755973159
Subject: Computer Software and Application of Computer
Abstract/Summary:
With the rapid development of the Internet, the way people obtain data has gradually evolved from single-view to multi-view. Multi-view learning has therefore become a research focus in fields such as artificial intelligence and machine learning. As a main research direction of multi-view learning, multi-view clustering has developed rapidly and made great progress. Multi-view data describe the same objects from different sources, structures, or perspectives, so the views often differ in feature representation, structure, or dimensionality, and they are related to one another through correlation, consistency, and complementarity. Compared with single-view data, multi-view data usually contain both consistent information and diverse information. How to separate consistency and diversity in multi-view data is one of the problems addressed in this thesis. In addition, because feature information contains noise and redundancy, how to mine effective discriminant features from the different views is the other problem addressed in this thesis. Solving these problems helps improve clustering performance. To this end, the thesis proposes two methods: diversity and consistency learning guided matrix factorization for multi-view clustering, and unsupervised linear discriminant analysis for multi-view clustering. The main contributions are as follows:

(1) Most existing multi-view clustering methods constrain consistency only in the data space and do not consider diversity and consistency in the label space. However, ignoring the impact of diversity on the clustering matrix produces wrong labels. To incorporate the effects of diversity in the label space, a novel multi-view clustering method is proposed. In the label space, the common label matrix is relaxed into a consistent part and a diverse part, which are integrated into a model based on multi-view k-means matrix factorization. Meanwhile, in the data space, each view is weighted using a self-weighting strategy. Because the original data contain redundant information and noise, and to avoid the curse of dimensionality, the original data are projected into a lower-dimensional space. An optimization scheme based on the augmented Lagrange multiplier method with alternating direction minimization guarantees the convergence of the method. Finally, experiments on six public real-world multi-view datasets demonstrate the effectiveness of the proposed method.

(2) The second contribution addresses the problem of mining effective discriminant information from the data of different views. Linear Discriminant Analysis (LDA) is a classical supervised learning method in machine learning that projects data into a low-dimensional space for feature extraction. However, LDA is usually applied to single-view data, and using it directly on multi-view data often does not achieve the desired performance. To overcome this issue, we propose a novel multi-view clustering method that finds the cluster labels of the data samples through sparse subspace learning and then uses LDA with these pseudo-labels to find discriminant features. The two subtasks are optimized iteratively to obtain purer labels. In addition, the subspace representation matrices are stacked into a third-order tensor, and a weighted tensor nuclear norm is introduced to capture the high-order consistency information among the views. The optimized representation matrices are then used to obtain the common indicator matrix through spectral clustering. Furthermore, subspace learning and feature projection can be naturally formulated as manifold regularization terms, which keep points that are adjacent in the original space close to each other in the low-dimensional space and thus preserve the neighborhood structure. An optimization scheme based on the Alternating Direction Method of Multipliers (ADMM) guarantees the convergence of the proposed method. Extensive experiments on different datasets validate the effectiveness of the proposed method.
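
The self-weighting idea in contribution (1) can be illustrated with a small sketch. This is not the thesis's model: it omits the consistent/diverse label decomposition, the projection step, and the augmented Lagrangian solver, and it assumes a commonly used auto-weighting rule in which each view's weight is proportional to 1 / (2 * sqrt(view loss)). The function names `farthest_first_labels` and `self_weighted_multiview_kmeans` are hypothetical and chosen only for illustration.

```python
import numpy as np


def farthest_first_labels(X, k):
    """Deterministic init: pick k spread-out rows as seeds, then
    assign every row to its nearest seed."""
    seeds = [0]
    d = ((X - X[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        seeds.append(int(d.argmax()))
        d = np.minimum(d, ((X - X[seeds[-1]]) ** 2).sum(axis=1))
    dist = ((X[:, None, :] - X[seeds][None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)


def self_weighted_multiview_kmeans(views, k, n_iter=20):
    """Shared-label k-means over several views (list of (n, d_v) arrays).

    Assumption: view weights follow w_v ~ 1 / (2 * sqrt(loss_v)),
    normalized to sum to 1, so noisier views count less.
    """
    n = views[0].shape[0]
    labels = farthest_first_labels(np.hstack(views), k)
    weights = np.full(len(views), 1.0 / len(views))
    for _ in range(n_iter):
        per_view = []
        for X in views:
            # One centroid per cluster in this view; an empty cluster
            # falls back to the view mean.
            centroids = np.stack([
                X[labels == c].mean(axis=0) if np.any(labels == c)
                else X.mean(axis=0)
                for c in range(k)
            ])
            sq = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            per_view.append(sq)
        # Labels are shared: minimize the weighted sum of view distances.
        combined = sum(w * d for w, d in zip(weights, per_view))
        new_labels = combined.argmin(axis=1)
        # Re-estimate the view weights from each view's clustering loss.
        losses = np.array([d[np.arange(n), new_labels].sum() for d in per_view])
        weights = 1.0 / (2.0 * np.sqrt(losses) + 1e-12)
        weights /= weights.sum()
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, weights
```

Called as `labels, weights = self_weighted_multiview_kmeans([X1, X2], k=2)`, the function returns one label vector shared by all views and one weight per view. Under the assumed rule, a view whose points sit far from their centroids receives a smaller weight.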
Keywords/Search Tags: clustering, multi-view learning, matrix factorization, linear discriminant analysis, sparse representation