Research On Clustering Methods For Low-quality Multi-view Data

Posted on: 2019-12-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y K Ye
Full Text: PDF
GTID: 1368330611492958
Subject: Computer Science and Technology
Abstract/Summary:
For a given object, data in various modalities can be acquired from different sources or from different aspects, resulting in multi-view data. How to learn from such consistent yet complementary multi-view data is an essential research problem. Moreover, in real-world applications, multi-view data are often noisy or partially missing for many reasons. Learning from such low-quality multi-view data is crucial for broadening the real-world application of multi-view learning. This dissertation focuses on multi-view clustering, a more challenging task than multi-view classification. More specifically, we focus on two typical types of low-quality multi-view data: multi-view data with incomplete views and multi-view data with noisy views. Building on existing research, we propose four effective methods for clustering low-quality multi-view data. The main contributions are as follows.

(1) For multi-view data with incomplete views, we propose a consensus kernel k-means clustering method that combines imputation and clustering in a unified framework while taking multi-view consistency into account. The proposed method learns a centroid clustering and imputes the incomplete views. We measure the similarity between the centroid clustering and the clustering of every view to explicitly model the consistency between views. The proposed method inherits the advantage of the latest work: it combines imputation and clustering in a unified framework and makes imputation more reasonable by considering both between-view relations and the clustering objective. Moreover, by explicitly modeling multi-view consistency, the learned model better fits the inherent nature of multi-view data. In addition, we design an alternating optimization algorithm that divides the original complex optimization problem into several easily solved subproblems by optimizing only a subset of the variables at a time. Comprehensive experimental results show that, by inheriting the advantage of the current work and adding between-view consistency modeling, the proposed method achieves superior clustering performance.

(2) For multi-view data with incomplete views, and in contrast to the early fusion methods for incomplete multi-view clustering, we propose a novel and effective late fusion method. According to the timing of information fusion, clustering methods for complete multi-view data can be divided into two groups: early fusion methods and late fusion methods. Current research on incomplete multi-view clustering focuses on early fusion methods, which fuse multi-view information before clustering. Late fusion methods, by contrast, conduct clustering in each view and then fuse the clustering results. The advantage of late fusion is that the information fusion is easier. Through experiments, we observe that clustering the visible instances of an incomplete view can reach high accuracy, which indicates that late fusion for incomplete multi-view clustering is feasible. However, traditional late fusion methods cannot deal with incomplete views. To address this issue, we propose a novel method for fusing the clustering results of incomplete views. First, we encode each view's clustering results; the encoded results can serve as compressed representations of each view. Then we propose an algorithm similar to k-means to search for a clustering decision that groups the visible compressed representations in each view well. As with k-means, the initial clustering decision affects the final clustering performance of the proposed method. Based on experimental analysis, we give some reasonable advice on initialization. Extensive experimental results show that the proposed method can effectively group incomplete multi-view data and, with suitable initialization, performs better than the classical early fusion method.

(3) For multi-view data with noisy views, we propose a multi-view clustering method that automatically sets the weights of the views. By adjusting the weights, the proposed method alleviates the negative effects of noisy views during information fusion and clustering. One important category of multi-view clustering methods is the family of methods that learn a centroid clustering. These methods usually treat all views equally or use pre-set view weights, which may allow potentially noisy views to strongly affect the final clustering performance. To address this issue, we propose a method that learns the centroid clustering and the view weights simultaneously, avoiding manual weight setting. We design an algorithm that alternately updates the view weights and the centroid clustering to solve the corresponding optimization problem. Compared to methods with fixed weights, the proposed method performs better. We suggest that, because the weights are adjusted during learning, potentially noisy views receive smaller weights, which alleviates their negative effects and leads to better performance.

(4) For multi-view data with noisy views, we propose another multi-view consensus clustering method. The proposed method learns the consensus clustering structure together with the effective clustering structure of each view after denoising. The effective clustering structure of each view is learned from that view's data and from the information in the multi-view consensus clustering structure. For a noisy view, the proposed method assigns a smaller weight to alleviate the effect of noise when learning the effective clustering structure. The proposed method learns the view weights, the effective structure of each view, and the consensus clustering structure jointly to achieve optimal multi-view clustering performance.
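
The late fusion idea in contribution (2) can be illustrated with a small sketch. The abstract does not give the encoding or the update rules, so everything below is an assumption made for illustration: each view's cluster labels are one-hot encoded, a missing instance is marked with `None`, and a k-means-style loop alternates between per-view centroids and consensus assignments using only the visible encodings. The function names `one_hot` and `late_fusion` are hypothetical and are not taken from the dissertation.

```python
def one_hot(label, k):
    """Encode a cluster label as a one-hot vector of length k."""
    return [1.0 if j == label else 0.0 for j in range(k)]


def late_fusion(view_labels, n_clusters, init, n_iter=20):
    """k-means-style fusion of per-view clustering results (illustrative sketch).

    view_labels[v][i] is the label of instance i in view v, or None if
    instance i is missing from that view. Each view is assumed to have at
    least one visible instance. init is the initial consensus assignment.
    """
    n = len(init)
    # Number of clusters found in each view, inferred from the visible labels.
    sizes = [max(l for l in labels if l is not None) + 1 for labels in view_labels]
    assign = list(init)
    for _ in range(n_iter):
        # Update step: mean one-hot code of each consensus cluster in each view,
        # computed over the visible members only.
        centroids = []
        for labels, k_v in zip(view_labels, sizes):
            view_centroids = []
            for c in range(n_clusters):
                members = [one_hot(labels[i], k_v) for i in range(n)
                           if assign[i] == c and labels[i] is not None]
                if members:
                    view_centroids.append(
                        [sum(col) / len(members) for col in zip(*members)])
                else:
                    # Uninformative fallback for an empty cluster (assumption).
                    view_centroids.append([1.0 / k_v] * k_v)
            centroids.append(view_centroids)
        # Assignment step: nearest consensus cluster over the visible views.
        new_assign = []
        for i in range(n):
            costs = []
            for c in range(n_clusters):
                cost = 0.0
                for v, labels in enumerate(view_labels):
                    if labels[i] is None:
                        continue  # skip views where the instance is missing
                    code = one_hot(labels[i], sizes[v])
                    cost += sum((a - b) ** 2 for a, b in zip(code, centroids[v][c]))
                costs.append(cost)
            new_assign.append(costs.index(min(costs)))
        if new_assign == assign:
            break
        assign = new_assign
    return assign


# Two views of six instances; view B uses permuted label names.
view_a = [0, 0, 0, 1, 1, None]
view_b = [1, None, 1, 0, 0, 0]
print(late_fusion([view_a, view_b], 2, init=[0, 0, 0, 1, 1, 0]))
# -> [0, 0, 0, 1, 1, 1]
```

Because each consensus centroid lives in the label space of its own view, the sketch needs no explicit label alignment across views. As the abstract notes for the actual method, the result of a loop like this depends on the initial assignment.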
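
The alternating scheme in contribution (3), which updates the view weights and the centroid clustering in turn, can be sketched in the same spirit. The abstract does not state the objective function, so this example swaps in a simple, commonly used heuristic as an assumption: each view's clustering is represented by a co-association matrix, the consensus is their weighted average, and each weight is set inversely proportional to the view's distance from the consensus. The names `coassociation`, `sq_dist`, and `learn_view_weights` are hypothetical.

```python
def coassociation(labels):
    """n x n matrix with 1.0 where two instances share a cluster, else 0.0."""
    n = len(labels)
    return [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
            for i in range(n)]


def sq_dist(a, b):
    """Squared Frobenius distance between two matrices given as nested lists."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def learn_view_weights(view_labels, n_iter=10, eps=1e-6):
    """Alternate between a weighted consensus and inverse-distance view weights.

    This is an illustrative heuristic, not the dissertation's objective.
    """
    mats = [coassociation(labels) for labels in view_labels]
    m, n = len(mats), len(mats[0])
    weights = [1.0 / m] * m  # start from equal weights
    consensus = mats[0]
    for _ in range(n_iter):
        # Consensus step: weighted average of the per-view matrices.
        consensus = [[sum(w * mat[i][j] for w, mat in zip(weights, mats))
                      for j in range(n)] for i in range(n)]
        # Weight step: views far from the consensus receive smaller weights.
        raw = [1.0 / (sq_dist(mat, consensus) ** 0.5 + eps) for mat in mats]
        total = sum(raw)
        weights = [r / total for r in raw]
    return weights, consensus


# Two views agree on the partition (with permuted labels); the third is noisy.
views = [[0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0], [0, 1, 0, 1, 0, 1]]
weights, _ = learn_view_weights(views)
print(weights)  # the third weight is the smallest
```

Without a regularizer, a reweighting rule of this kind tends to concentrate nearly all weight on the views that agree with each other, so a practical objective would normally constrain or smooth the weights.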
Keywords/Search Tags: multi-view, clustering, low quality, missing data, incompleteness, noise