Research On Multi-view Clustering Method For Incomplete Information

Posted on: 2024-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: K W Zhang
GTID: 2568307055497944
Subject: Computer technology
Abstract/Summary:
As one of the most important unsupervised methods, clustering has long been a key technique in pattern recognition and machine learning. Clustering groups data samples according to specific criteria so that samples in the same group are more similar to one another than to samples in other groups. With the rapid development of Internet and communication technology, most real-world data are generated from multiple sources or described by various feature collectors; such data are called multi-view data. Because of its advantages in partitioning unlabeled multi-view data, multi-view clustering has attracted growing research attention. In practical applications, however, multi-view data may lose some views because of machine damage or sensor problems, which produces incomplete multi-view data. Existing multi-view clustering methods are very limited on such data and are often inapplicable, so incomplete multi-view clustering methods have been proposed to handle it. This thesis studies incomplete multi-view clustering from two angles, traditional machine learning and deep learning. The main research contents and innovations are as follows:

1. Existing incomplete multi-view clustering algorithms focus on learning consistent representations, ignore the importance of local data structures, and do not consider that different views differ in importance. To address these issues, this thesis proposes an incomplete multi-view clustering algorithm based on low-rank representation and adaptive graph regularization (IMC-LRAGR). In the IMC-LRAGR model, a distance regularization term and non-negativity constraints based on low-rank representations are first combined to learn graphs that capture both global and local data structures. Second, spectral clustering is used to obtain a corresponding low-dimensional representation of each graph. Finally, a new weighted fusion mechanism is introduced to learn a consistent representation of all views, and the K-means algorithm is applied to obtain the final clustering results. Extensive experimental results on standard datasets demonstrate that IMC-LRAGR outperforms state-of-the-art incomplete multi-view clustering algorithms.

2. Deep incomplete multi-view clustering algorithms based on contrastive learning rely on additional projection heads to prevent dimensional collapse, and they ignore the conflict between the inconsistency of view-private information and the common semantic consistency across all views. To address these issues, this thesis proposes a deep incomplete multi-view clustering algorithm (PDCIMC) that prevents dimensional collapse via direct contrastive learning. PDCIMC is a simple and straightforward incomplete multi-view clustering model that makes better use of the latent representation information in each dimension. Specifically, an autoencoder first learns the feature representation of each view from the original features. Sub-vectors of the learned feature vectors are then fed to the contrastive loss function, which directly optimizes the representation space and effectively prevents dimensional collapse in it. To avoid the performance degradation caused by recovering view-inconsistent information, PDCIMC recovers the missing views by minimizing conditional entropy, so that inconsistent information is discarded. Performing both reconstruction and consistency learning directly on the same latent features can cause the conflict described above. Unlike most existing methods, which require an MLP, PDCIMC takes a simpler approach: reconstruction learning is implemented on the latent features, and consistency learning is implemented on their sub-vectors. This makes better use of each view's useful information at the same level of feature representation. Experimental results on publicly available benchmark datasets demonstrate that PDCIMC can achieve the best clustering results.
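The idea of applying the contrastive loss to sub-vectors of the latent features, rather than to the output of a projection head, can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not specify how the sub-vector is chosen or which contrastive loss is used, so the sketch assumes the sub-vector is the first `d0` dimensions and uses a standard InfoNCE loss with cosine similarity. The function names, the `d0` parameter, and the temperature value are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def subvector_info_nce(z_a, z_b, d0, temperature=0.5):
    """InfoNCE loss on the first d0 dimensions of paired latent features.

    z_a[i] and z_b[i] are latent vectors of the same sample in two views.
    Assumption: the "sub-vector" is the leading d0 dimensions; the remaining
    dimensions are ignored by this loss and stay available to a
    reconstruction loss computed on the full latent vector.
    """
    a = [l2_normalize(row[:d0]) for row in z_a]
    b = [l2_normalize(row[:d0]) for row in z_b]
    n = len(a)
    total = 0.0
    for i in range(n):
        # Cosine similarity of sample i in view A with every sample in view B.
        logits = [
            sum(x * y for x, y in zip(a[i], b[j])) / temperature
            for j in range(n)
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching sample (index i) as the positive.
        total += log_denom - logits[i]
    return total / n
```

In this sketch, views whose leading sub-vectors agree sample by sample give a lower loss than views whose pairing is shuffled, and changing dimensions beyond `d0` leaves the loss unchanged.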
Keywords/Search Tags: Incomplete multi-view clustering, Low-rank representation, Graph regularization, Contrastive learning, Multi-view representation learning