
Research And Application Of Clustering Algorithm For Multi-View Data

Posted on: 2023-04-26
Degree: Master
Type: Thesis
Country: China
Candidate: R Ma
GTID: 2568306818497014
Subject: Control Science and Engineering
Abstract/Summary:
As a relatively mature unsupervised analysis method, cluster analysis has been widely used in data mining, pattern recognition, image segmentation, and other research fields. However, most traditional clustering methods are designed for single-view data. In many practical data mining scenarios, data are described by multiple independent views that carry both complementary and consistent information. In addition, as data scale continues to grow, handling large-scale, high-dimensional data effectively has become an urgent problem. To this end, this thesis focuses on clustering algorithms for large-scale data in multi-view scenarios, exploiting the heterogeneity among views and their latent connections to obtain more complete information about the underlying data structure and thereby improve clustering performance. The thesis also improves the efficiency of multi-view clustering by using landmark sampling and other techniques, and applies the improved algorithm to community discovery in complex networks with multi-view structure. The main research contents cover the following three aspects:

(1) Most traditional multi-view subspace clustering algorithms focus only on shallow clustering, cannot fully capture the deep structure of the data, and do not study the self-representation level of the data in depth. To address this, the thesis proposes a deep multi-view subspace clustering algorithm with exclusive constraints. The model uses a deep autoencoder to map each view nonlinearly into a low-dimensional subspace, capturing the deep structure of the raw data. The exclusive constraint is applied to the self-representation matrix located in the middle layer to better preserve local properties, so that the complementary and consistent information of the multi-view data is retained in the multi-view consensus self-representation matrix. A joint learning framework iteratively updates the autoencoder parameters and the clustering parameters to improve clustering performance. Experiments on multi-view datasets show that the method better mines the underlying complementary structure of multi-view data, thereby improving clustering accuracy.

(2) Most existing multi-view clustering methods focus only on clustering accuracy and neglect algorithmic efficiency, which makes them difficult to apply to large-scale data. To address this, the thesis proposes a fast multi-view clustering algorithm that combines landmark points and autoencoders. The model uses the weighted PageRank algorithm to assign weights to the sample points of each view and then selects landmark points for each view to reduce the data scale. The independent similarity matrix of each view is generated directly from the data using a convex quadratic programming function, and the low-storage multi-view consensus similarity matrix is fed into the autoencoder in place of the eigendecomposition of the Laplacian matrix, which reduces the computational complexity of the algorithm. The proposed algorithm can be applied to large-scale multi-view datasets. Experiments on multi-view datasets demonstrate its superior running efficiency.

(3) To address community discovery in complex networks that are both large-scale and multi-view, and building on the landmark representation and similarity matrix construction scheme above, the thesis proposes a deep multi-view community discovery algorithm based on an approximate autoencoder structure. Inspired by deep autoencoders and non-negative matrix factorization, the model turns a single-layer non-negative matrix factorization model into a multi-layer one containing encoding and decoding layers. The model learns better low-dimensional network feature representations, so that more accurate community structures can be obtained for each view independently. The multi-view network structure is fused at the level of the class partition space, and view weights are introduced to maintain the heterogeneity of the multi-view network, thereby enhancing the completeness of the captured data. Furthermore, to make the model suitable for large-scale complex networks, the weighted PageRank method is used in each network for landmark selection. Experiments on multiple complex networks with multi-view structure show that the proposed algorithm achieves more realistic community discovery results.
Keywords/Search Tags: data mining, multi-view clustering, autoencoder, non-negative matrix factorization, community discovery
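The landmark-selection step mentioned in parts (2) and (3) of the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the Gaussian-kernel graph, the damping factor of 0.85, the rule of keeping the `m` highest-scoring samples, and the function names `gaussian_affinity`, `weighted_pagerank`, and `select_landmarks` are not taken from the thesis.

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian-kernel affinity with zero diagonal (assumed graph construction)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def weighted_pagerank(W, damping=0.85, tol=1e-10, max_iter=200):
    """PageRank scores on a weighted graph via power iteration."""
    n = W.shape[0]
    col_sums = W.sum(axis=0)
    has_edges = col_sums > 0
    # Column-stochastic transitions; nodes without edges jump uniformly.
    P = np.where(has_edges, W / np.where(has_edges, col_sums, 1.0), 1.0 / n)
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        r_next = damping * (P @ r) + (1.0 - damping) / n
        converged = np.abs(r_next - r).sum() < tol
        r = r_next
        if converged:
            break
    return r


def select_landmarks(X, m, sigma=1.0):
    """Indices of the m samples with the highest PageRank score in one view."""
    scores = weighted_pagerank(gaussian_affinity(X, sigma))
    return np.argsort(-scores)[:m]


# Hypothetical two-view toy data: the same 200 samples with 5 and 8 features.
rng = np.random.default_rng(0)
views = [rng.normal(size=(200, 5)), rng.normal(size=(200, 8))]
landmarks_per_view = [select_landmarks(X, m=20, sigma=2.0) for X in views]
```

Each view gets its own set of 20 landmark indices, so later similarity matrices can be built between all samples and the landmarks only, rather than between all pairs of samples. Keeping the top-scoring samples is just one simple selection rule; the thesis may sample or weight landmarks differently.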