The Research Of Incomplete Multi-View Clustering Algorithm Based On Matrix Factorization

Posted on: 2022-05-19
Degree: Master
Type: Thesis
Country: China
Candidate: G L Niu
GTID: 2518306605471214
Subject: Applied Mathematics
Abstract/Summary:
With the rapid development of data storage and processing technology, many types of data have emerged, including text, image, and web data. Some data are observed from different sources or from different views; such data are called multi-view data. However, because of errors and failures during collection and preprocessing, many of the views obtained in practice are incomplete. Cluster analysis is an important operation in data mining, so clustering incomplete multi-view data has become an important research topic. In recent years, researchers have proposed various clustering methods based on different models. Although these methods are effective for clustering incomplete multi-view data, incomplete multi-view clustering algorithms based on matrix factorization still have deficiencies and need further research and improvement.

At present, incomplete multi-view clustering algorithms based on matrix factorization have three main shortcomings: (1) They separate the representation of incomplete multi-view data from the clustering process and do not fully consider the connection between the two, which degrades clustering performance. (2) Most previous algorithms assume a linear relation between the data and the representation matrix when learning the clustering, so they cannot handle nonlinear structural relations. (3) The consistent representation matrix is obtained through consistency learning and is then usually clustered with K-means, which is sensitive to its initial values.

To address these problems, this thesis proposes two methods for improving the clustering of incomplete multi-view data. First, to address shortcomings (1) and (3), it proposes an incomplete multi-view subspace clustering algorithm based on low-rank matrix factorization (IMSC-LRMF). IMSC-LRMF first uses low-rank matrix factorization to obtain a consistent representation matrix. It then introduces a new clustering method that combines self-expressiveness learning with non-negative embedding and spectral embedding. Finally, the two processes are combined so that representation learning and clustering are carried out in a single unified optimization process and require no post-processing step (e.g., K-means), which avoids the sensitivity to initial values.

Second, to address shortcoming (2), this thesis proposes an incomplete multi-view clustering algorithm based on non-negative embedding and spectral embedding (IMC-NESE). IMC-NESE first constructs a similarity matrix for each view and uses mean values to fill in the missing entries, yielding a complete similarity matrix. It then clusters with a multi-view clustering method built from symmetric non-negative matrix factorization and spectral clustering. Finally, the initialized similarity matrix and the clustering are combined in a unified objective function so that they refine each other until convergence. The algorithm reveals the nonlinear structure of the data and obtains the clustering results directly, without post-processing.

Finally, this thesis analyzes the clustering performance, convergence, and parameter sensitivity of the two algorithms. The experimental results indicate that both methods improve the clustering performance on incomplete multi-view data.
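The general idea behind the second method (per-view similarity matrices, mean filling, then a symmetric non-negative factorization whose factor gives the clusters directly) can be illustrated with a minimal Python sketch. This is not the thesis's IMC-NESE algorithm: the abstract does not give its objective function or update rules. The sketch assumes a Gaussian similarity, an entry-wise mean over the views in which a pair of samples is observed, and the standard damped multiplicative update for symmetric NMF. It omits the spectral embedding and the joint refinement of the similarity matrix. The function names `view_similarity`, `mean_fill`, and `symnmf` are hypothetical.

```python
import numpy as np


def view_similarity(X, observed, sigma=1.0):
    """Gaussian similarity for one view (assumed kernel, not from the thesis).

    X is (n, d); observed is a boolean mask of length n. Entries that
    involve a sample missing from this view are left as NaN.
    """
    n = observed.shape[0]
    S = np.full((n, n), np.nan)
    idx = np.flatnonzero(observed)
    Xo = X[idx]
    # Squared Euclidean distances between all observed samples.
    d2 = ((Xo[:, None, :] - Xo[None, :, :]) ** 2).sum(axis=2)
    S[np.ix_(idx, idx)] = np.exp(-d2 / (2.0 * sigma ** 2))
    return S


def mean_fill(S_list):
    """Average each entry over the views where it is defined.

    Entries that no view defines fall back to 0 (an assumption).
    """
    stack = np.stack(S_list)
    count = (~np.isnan(stack)).sum(axis=0)
    total = np.nansum(stack, axis=0)
    return np.where(count > 0, total / np.maximum(count, 1), 0.0)


def symnmf(S, k, n_iter=300, beta=0.5, seed=0):
    """Symmetric NMF, S ~ H H^T with H >= 0, by damped multiplicative updates."""
    rng = np.random.default_rng(seed)
    H = rng.random((S.shape[0], k))
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        H *= (1.0 - beta) + beta * (S @ H) / (H @ (H.T @ H) + eps)
    return H


if __name__ == "__main__":
    # Toy data: two well-separated groups seen through two noisy views,
    # with a few samples missing from each view.
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, 0.0], [6.0, 6.0]])
    truth = np.repeat([0, 1], 10)
    views = [centers[truth] + 0.5 * rng.standard_normal((20, 2)) for _ in range(2)]
    masks = [np.ones(20, dtype=bool), np.ones(20, dtype=bool)]
    masks[0][[3, 14]] = False
    masks[1][[7, 18]] = False

    S = mean_fill([view_similarity(X, m) for X, m in zip(views, masks)])
    H = symnmf(S, k=2)
    labels = H.argmax(axis=1)  # cluster read directly from H, no K-means step
    print(labels)
```

Reading the label of each sample as the largest entry in its row of `H` is what makes a separate K-means step unnecessary in this sketch. Because the similarity is a nonlinear kernel of the data, the factorization acts on pairwise structure instead of on a linear reconstruction of the features.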
Keywords/Search Tags: Incomplete multi-view clustering, Low-rank matrix factorization, Symmetric non-negative matrix factorization, Non-negative embedding, Spectral embedding