Incomplete Multi-view Clustering

Posted on: 2020-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: M L Hu
Full Text: PDF
GTID: 2428330590472681
Subject: Software engineering
Abstract/Summary:
With the growing diversity of data acquisition devices, real-world data often come in multiple modalities or from multiple heterogeneous sources, forming so-called multi-view data. Machine learning on such data is called multi-view learning. As an important paradigm of multi-view learning, multi-view clustering has attracted extensive attention from researchers because labeling samples is time-consuming and laborious. To date, almost all previous studies assume that every view is complete. In reality, however, each view often has some missing instances. Such incompleteness makes it impossible to apply traditional multi-view clustering methods directly. Clustering such datasets is called incomplete multi-view clustering, and it is quite challenging. Its goal is to combine the complementary and consistent information across views in a reasonable way, so as to reduce the impact of missing samples and improve clustering performance. This thesis takes such data as its research object, studies the related clustering algorithms, and achieves the following results:

(1) Doubly Aligned Incomplete Multi-view Clustering (DAIMC) is proposed. The algorithm is based on weighted semi-nonnegative matrix factorization (Semi-NMF) and uses the given sample alignment information to learn a common latent feature matrix for all views. Furthermore, DAIMC establishes a consensus basis matrix with the help of L2,1-norm regularized regression to reduce the influence of missing instances. Consequently, besides inheriting the ability of Semi-NMF to handle negative entries, DAIMC has two advantages over existing methods: 1) it handles incomplete views by introducing a separate weight matrix for each view, which makes it easy to extend to more than two views; 2) it reduces the influence of view incompleteness on clustering by using regression to align the basis matrices of the individual views. Experiments on four real-world datasets demonstrate its advantages.

(2) One-Pass Incomplete Multi-view Clustering (OPIMC) is proposed. Although many incomplete multi-view clustering approaches have been developed, most of them are offline and incur high computational and memory costs, especially on large-scale datasets. To address this problem, we propose a One-Pass Incomplete Multi-view Clustering (OPIMC) framework. With the help of regularized matrix factorization and weighted matrix factorization, OPIMC can handle such large-scale incomplete data with relative ease. Unlike the only existing online incomplete multi-view clustering (IMC) method, OPIMC obtains clustering results directly and, by introducing two global statistics, effectively determines when to terminate the iteration process. Finally, extensive experiments on four real-world datasets demonstrate the efficiency and effectiveness of the proposed OPIMC method.
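The weighted factorization idea behind DAIMC can be illustrated with a small sketch. It is not the thesis's algorithm: it assumes a 0/1 weight per instance and view (1 if the instance is present), uses a least-squares update for each view's basis matrix and a projected gradient step for the shared nonnegative latent matrix, and leaves out the L2,1-norm regularized regression term entirely. The names `fit_shared_latent`, `weighted_loss`, `views`, and `masks` are hypothetical and chosen only for this example.

```python
import numpy as np

def weighted_loss(views, masks, Us, V):
    # Squared reconstruction error, counted only on instances present in each view.
    return sum(
        np.sum(((V @ U.T - X.T) * w[:, None]) ** 2)
        for X, w, U in zip(views, masks, Us)
    )

def fit_shared_latent(views, masks, k, n_iter=100, seed=0):
    """views: list of (d_v, n) arrays; masks: list of (n,) 0/1 weight vectors."""
    rng = np.random.default_rng(seed)
    n = views[0].shape[1]
    V = rng.random((n, k))  # shared latent features, kept nonnegative
    history = []
    for _ in range(n_iter):
        # Basis step: exact weighted least squares per view; entries may be negative.
        Us = []
        for X, w in zip(views, masks):
            Ut, *_ = np.linalg.lstsq(V * w[:, None], (X * w).T, rcond=None)
            Us.append(Ut.T)
        # Latent step: one projected gradient step with a safe step size.
        lip = 2.0 * sum(np.linalg.norm(U.T @ U, 2) for U in Us) + 1e-12
        grad = sum(
            2.0 * ((V @ U.T - X.T) * w[:, None]) @ U
            for X, w, U in zip(views, masks, Us)
        )
        V = np.maximum(V - grad / lip, 0.0)
        history.append(weighted_loss(views, masks, Us, V))
    return Us, V, history

# Synthetic two-view data (assumed, for illustration only).
rng = np.random.default_rng(1)
n, k = 60, 3
V_true = rng.random((n, k))
views = [rng.standard_normal((d, k)) @ V_true.T for d in (8, 5)]
masks = [np.ones(n), np.ones(n)]
masks[0][:10] = 0.0   # first 10 instances are missing from view 0
masks[1][-10:] = 0.0  # last 10 instances are missing from view 1

Us, V, history = fit_shared_latent(views, masks, k)
```

Because the weights zero out missing instances, adding a third view only means appending another array and mask to the lists, which mirrors the claim that per-view weight matrices extend easily beyond two views. Cluster labels could then be obtained by running a standard clustering method such as k-means on the rows of `V`.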
Keywords/Search Tags:Machine Learning, Multi-view Clustering, Incomplete Multi-view Clustering, Matrix Factorization, Online Learning, One-Pass Learning