
Research On Deep Clustering Method For Multi-view Data

Posted on: 2021-04-07
Degree: Master
Type: Thesis
Country: China
Candidate: W T Huang
Full Text: PDF
GTID: 2518306470462884
Subject: Control Science and Engineering
Abstract/Summary:
In the big data era, data are often collected from different sources or observed from different views; such data are referred to as multi-view data. In big data mining and analysis, it is important to fully extract and exploit the information contained in multi-view data. At the same time, multi-view data are typically large in scale, high-dimensional, and complex in internal structure. Effectively fusing multi-view data and discovering the connections between views for efficient learning is therefore a challenging task. This thesis studies representation learning, multi-view learning, and clustering based on deep learning, and proposes improvements to existing clustering and multi-view learning methods.

First, this thesis proposes a deep clustering method based on a weighted K-subspace network. The method combines a deep autoencoder with K-subspace clustering, adds soft labels and regularization terms to reduce the sensitivity of K-subspace clustering to outliers and initialization, and trains the model jointly within a unified framework. This end-to-end method addresses the storage and processing-time costs of large-scale data. In addition, reliable K subspaces can be found from randomly initialized model parameters, which removes the dependence on other clustering networks for parameter initialization in deep clustering and improves the stability of the clustering results.

Second, this thesis studies a shared generative latent representation learning network for multi-view clustering. It learns shared latent variables that follow a Gaussian mixture distribution through variational autoencoders, and then performs clustering on these shared latent variables. In particular, the model assumes that, despite the differences between views, multi-view data share a common latent variable that follows a Gaussian mixture distribution, and that the multi-view data can be sampled from this common latent variable. To make better use of the consistency information in multi-view data, a set of non-negative combination weights is introduced to fuse the latent spaces of the different views. These weights reflect the contribution of each view to the shared latent variable. The model uses the consistent information across views for clustering in an unsupervised manner and models the generation process. It not only improves clustering performance but also supports the fusion and generation of multi-view data. In addition, thanks to batch training and gradient-descent optimization, the model extends readily to large-scale data.

Finally, experiments on multiple public benchmark data sets and large-scale data sets are conducted to demonstrate the effectiveness of the proposed methods.
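
As an illustration of the non-negative combination weights mentioned above, the following is a minimal sketch of fusing per-view latent vectors into one shared vector. It is not the thesis's implementation. The abstract only states that the weights are non-negative; the softmax parameterization, the sum-to-one normalization, and the names `softmax`, `fuse_views`, `view_latents`, and `view_scores` are assumptions made for this example.

```python
import math


def softmax(scores):
    """Map unconstrained scores to non-negative weights that sum to 1.

    Assumption: the source only requires non-negativity; softmax is one
    convenient way to guarantee it during gradient-based training.
    """
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(view_latents, view_scores):
    """Fuse one latent vector per view into a single shared latent vector.

    view_latents: list of V vectors (lists of floats), all of length d.
    view_scores:  list of V unconstrained scores (hypothetical learnable
                  parameters), one per view.
    Returns (fused_vector, weights).
    """
    if not view_latents or len(view_latents) != len(view_scores):
        raise ValueError("need one score per view and at least one view")

    weights = softmax(view_scores)
    dim = len(view_latents[0])
    fused = [0.0] * dim

    for weight, latent in zip(weights, view_latents):
        if len(latent) != dim:
            raise ValueError("all views must share the same latent size")
        # Weighted sum: a view with a larger weight contributes more.
        for i, value in enumerate(latent):
            fused[i] += weight * value

    return fused, weights


if __name__ == "__main__":
    latents = [[1.0, 2.0], [3.0, 4.0]]  # two views, 2-D latent space
    fused, weights = fuse_views(latents, [0.0, 0.0])
    print(weights, fused)
```

With equal scores both views receive the same weight, so the fused vector is the plain average of the two latent vectors. Raising one view's score shifts the fused vector toward that view, which matches the reading of the weights as per-view contributions.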
Keywords/Search Tags: multi-view learning, representation learning, deep clustering, autoencoder