Research On Multiview Learning Method And Its Application In High Dimensional Incomplete Data Clustering

Posted on: 2022-11-26
Degree: Master
Type: Thesis
Country: China
Candidate: J J Chen
GTID: 2518306614459814
Subject: Computer Software and Application of Computer
Abstract/Summary:
Data clustering is an important research branch of data mining: it groups data without using label information. Because no labeled information is available to guide or supervise the process, the current mainstream approach is first to exploit the inherent relationships among the data to learn, without supervision, an effective low-dimensional representation that better separates different clusters, and then to feed this low-dimensional representation to a classic clustering algorithm to obtain the clustering result. Traditional low-dimensional representation learning models mostly target single-view data. However, the one-sided description offered by a single view prevents the learning model from reaching its full potential, which in turn degrades the performance of subsequent clustering tasks. As data grow explosively, the ways in which data are expressed keep changing, giving rise to a form of description called multi-view data. Multi-view data means that the same object is described in several different forms, each of which is called a view. Such a description carries information about the data at multiple levels and expresses the data more comprehensively. Using multi-view data for low-dimensional representation learning can therefore overcome the limits that single-view data place on the model's learning ability and yield a low-dimensional representation that characterizes the data more completely. In real environments, however, many objective factors leave the acquired view data incomplete, for example corrupted by noise or missing features, which degrades the performance of existing multi-view learning methods and their clustering applications. To address this problem, this thesis focuses on representation learning for multi-view data and its application to clustering, and proposes three multi-view learning methods for high-dimensional incomplete data. The specific research content includes the following aspects:

(1) A robust multi-view learning method based on similarity learning is proposed. A low-rank constraint is imposed on the data representation to counter the interference of noise and outliers, so that the method learns a copy of the data that is close to the true data structure and then learns a robust graph from that copy. In addition, a multi-view scheme is designed that obtains a consistent similarity across all view graphs by learning the graphs of all views dynamically. The consistent similarity also propagates latent information from the other views, which in turn improves the learning of each view graph. Finally, these two processes are combined into a unified objective function, and the global optimal solution is obtained by alternating optimization. Experimental results on four public data sets show that the proposed method outperforms most existing methods in similarity learning and is strongly robust to incomplete data.

(2) A reliable multi-view learning method based on low-rank tensors is proposed. It targets the incomplete-data learning problem caused by missing features. On the one hand, the method places a data compensation model and graph learning in a unified framework: the compensation model recovers the defective data, the nearest-neighbor relationships between sample pairs are learned from the reconstructed data, and the effect of the missing features on the original data distribution is thereby offset. On the other hand, to exploit the multi-view information while preserving the two-dimensional structure of each nearest-neighbor graph, tensor analysis is introduced to construct a multi-view fusion constraint on graph learning, which further captures the high-order latent correlations among the nearest-neighbor graphs of different views. In addition, an effective numerical scheme is designed to solve the proposed objective function and to ensure its convergence. Multi-view clustering experiments on two incomplete data sets show that the method outperforms current mainstream multi-view clustering methods on many performance metrics and in robustness.

(3) A collaborative multi-view learning method based on double-graph incomplete data repair is proposed. Unlike current mainstream methods for recovering incomplete data, this method uses the consistency and complementarity of multi-view data to recover the missing values directly at the data level, so that the data used in the subsequent clustering process are complete and carry a large amount of useful information. At the same time, multi-kernel collaborative training is used to learn a robust representation of the data, and low-rank tensor constraints are introduced to promote multi-view fusion, so that the fusion graph used for clustering covers more of the high-order correlations hidden in the multi-view data. These steps are placed in a joint learning framework, so that the variables reinforce one another and propagate useful information during the iterations. In addition, an alternating optimization scheme is designed to solve the proposed model efficiently. Experimental results on four visual data sets show that the method has clear advantages in incomplete data clustering.
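The abstract does not state the objective functions of the three methods, so none of them can be reproduced from this page. As a point of reference only, the sketch below shows the generic pipeline they build on: construct one nearest-neighbor similarity graph per view, fuse the graphs, and run spectral clustering on the fused graph. Everything in it is an assumption made for illustration and is not taken from the thesis: the function names, the Gaussian kernel with a median bandwidth, the plain averaging used as fusion (the thesis instead learns a consensus graph with low-rank and tensor constraints), the synthetic two-view data, and the neighbor count `k=10`.

```python
import numpy as np


def gaussian_knn_graph(X, k=10):
    """Symmetric k-nearest-neighbor similarity graph for one view (rows = samples)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    sigma2 = np.median(d2[d2 > 0])           # heuristic bandwidth (assumption)
    W = np.exp(-d2 / sigma2)
    np.fill_diagonal(W, 0.0)
    # Zero out everything except the k strongest neighbors in each row.
    weakest = np.argsort(W, axis=1)[:, :-k]
    np.put_along_axis(W, weakest, 0.0, axis=1)
    return np.maximum(W, W.T)                # symmetrize


def spectral_embedding(W, n_clusters):
    """Row-normalized eigenvectors of the normalized Laplacian (smallest eigenvalues)."""
    deg = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)              # eigenvalues come back in ascending order
    U = vecs[:, :n_clusters]
    return U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)


def kmeans(U, n_clusters, n_iter=50):
    """Lloyd iterations with a deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, n_clusters):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = np.argmin(((U[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(n_clusters):
            if np.any(labels == j):
                centers[j] = U[labels == j].mean(axis=0)
    return labels


def multiview_spectral_clustering(views, n_clusters, k=10):
    """Average the per-view graphs into one fused graph, then cluster it."""
    fused = np.mean([gaussian_knn_graph(X, k) for X in views], axis=0)
    return kmeans(spectral_embedding(fused, n_clusters), n_clusters)


# Two synthetic views of the same 40 objects (20 per cluster), for illustration.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 20)
view_a = rng.normal(size=(40, 3)) * 0.3 + truth[:, None] * 4.0
view_b = rng.normal(size=(40, 5)) * 0.3 - truth[:, None] * 4.0
labels = multiview_spectral_clustering([view_a, view_b], n_clusters=2)
```

Averaging treats every view as equally reliable and assumes every sample is fully observed in every view, which is exactly the assumption that noisy or feature-missing data violate. That gap is what motivates the robust graph learning and data recovery steps described in (1) to (3).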
Keywords: Incomplete data, Multiview learning, Tensor analysis, Low-rank representation, Spectral clustering