Research on Clustering with Multi-view and Incomplete Features

Posted on: 2020-01-02
Degree: Master
Type: Thesis
Country: China
Candidate: S W Wang
Full Text: PDF
GTID: 2518306548994659
Subject: Computer Science and Technology
Abstract/Summary:
Clustering is one of the most fundamental research problems in the machine learning and data mining communities. Its aim is to automatically extract information from massive unlabeled data so that similar samples are grouped together and dissimilar samples are separated. In recent years, clustering with multi-view features and incomplete features has attracted increasing attention and has gradually become a hot topic. Existing multi-view clustering methods mainly fuse information at the feature or similarity level. As a result, their optimization steps are complicated and their time complexity is relatively high. Meanwhile, existing incomplete clustering algorithms separate missing-value imputation from the clustering process and often cannot obtain satisfactory clustering results. To address these issues, we propose efficient multi-view clustering methods based on late fusion and alignment maximization, and a k-means filling method for incomplete features. The contributions of our work can be summarized in three aspects:

(1) An efficient multi-view clustering framework based on late fusion (MKKM-LF) is proposed. The framework generates clustering partition matrices from the different views and fuses them into a consensus cluster partition. To the best of our knowledge, this is the first time that late fusion is used for multi-kernel methods to enhance the diversity of clustering results while greatly reducing the time complexity of traditional multi-kernel clustering algorithms. To implement the proposed late fusion framework, two novel algorithms, one with average weights and one with adaptive weights, are proposed to solve the resulting multi-kernel k-means optimization problem, both with proved convergence. In addition, we show theoretically and experimentally that the time complexity of the two algorithms grows linearly with the number of samples, which makes them more practical in real applications. Experiments on six benchmark datasets show that our algorithms achieve clustering performance comparable to or better than state-of-the-art methods at lower time cost, which demonstrates the advantages of late fusion in multiple kernel k-means.

(2) We propose MVC-LFA, which solves multi-view clustering by late fusion alignment maximization. We theoretically demonstrate that maximizing the alignment between the base partition matrices and the optimal clustering partition is conceptually equivalent to minimizing the loss function of the existing k-means algorithm. Therefore, the proposed late fusion alignment maximization not only integrates the base partitions with the optimal partition, but also improves clustering performance. To the best of our knowledge, MVC-LFA is the first attempt to solve the multi-view clustering problem by maximizing the alignment between the consensus clustering partition and the weighted base partitions. To solve the resulting optimization problem efficiently, an alternating optimization algorithm is derived, with convergence established both theoretically and experimentally. Compared with existing multi-view clustering methods, MVC-LFA shows better clustering performance and lower time complexity on the evaluated datasets.

(3) We propose an incomplete clustering method based on k-means (k-means Filling). Unlike existing algorithms, which separate the imputation and clustering procedures, the proposed algorithm unifies the two processes into a single optimization objective. Missing features are alternately re-estimated so that they better serve clustering, while the observed data remain unchanged throughout the process. In addition, we design an alternating algorithm with fast convergence to solve the optimization problem. Extensive experiments are conducted on nine standard UCI datasets and several large practical applications. Compared with commonly used incomplete clustering methods, the proposed algorithm consistently achieves better performance in these experiments, which demonstrates its effectiveness.
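The alternating idea behind contribution (3) can be illustrated with a minimal sketch. The abstract does not give the exact objective or update rules of k-means Filling, so the code below is only an assumption-laden illustration: it uses plain Lloyd-style k-means in NumPy, marks missing entries with `np.nan`, initializes them with column means, and then refills each missing entry with the corresponding coordinate of its assigned cluster center. The function name `kmeans_filling` and its parameters are hypothetical and do not come from the thesis.

```python
import numpy as np


def kmeans_filling(X, k, n_iter=50, seed=0):
    """Illustrative joint imputation and clustering (not the thesis's exact method).

    X is an (n, d) array in which np.nan marks a missing entry.
    Assumes every column has at least one observed value.
    Returns (labels, filled), where observed entries of `filled` equal X.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Assumed initialization: replace missing entries with observed column means.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    # Pick k distinct samples as the initial centers.
    centers = filled[rng.choice(len(filled), size=k, replace=False)]
    labels = np.full(len(filled), -1)

    for _ in range(n_iter):
        # Step 1: assign each (currently filled) sample to its nearest center.
        dists = ((filled[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dists.argmin(axis=1)

        # Step 2: update each non-empty cluster's center as the mean of its members.
        for j in range(k):
            members = filled[new_labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)

        # Step 3: refill ONLY the missing entries from the assigned center;
        # observed entries are copied from X and therefore never change.
        filled = np.where(missing, centers[new_labels], X)

        # Stop once the assignment no longer changes.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return labels, filled
```

Calling `kmeans_filling(X, k=2)` on an array containing `np.nan` entries returns cluster labels together with a completed copy of the data. Because imputation and assignment share one loop, each refill step can only keep or lower the k-means loss for the current assignment, which is the intuition behind unifying the two processes; the thesis's actual algorithm and convergence analysis may differ from this sketch.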
Keywords/Search Tags: Multi-view Clustering, Multiple Kernel Clustering, Incomplete Clustering