
Clustering Research Based On Deep Learning

Posted on: 2022-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z X Xu
Full Text: PDF
GTID: 2518306758491704
Subject: Automation Technology
Abstract/Summary:
Clustering is a classic task in data mining, in which unlabelled input data are organized into clusters according to predefined similarity measures. Over the past years, various clustering approaches have been proposed to solve real-world problems such as text clustering and image clustering. With the emergence of graph-structured data, such as biological networks and social networks, partitioning the nodes of an attributed graph has attracted a lot of attention. Unlike text and image data, each node in an attributed graph has a set of attribute features. The attribute information represents the feature values of a node itself, while the structural information indicates potential similarity between nodes. How to use both the feature information and the structural information in graph data effectively for clustering is therefore an important issue.

Classical graph clustering algorithms construct a similarity matrix of the node features and then perform clustering on that matrix, which emphasizes the importance of structural information. Although later studies attempt to integrate both node features and network structure for clustering, these methods rarely explore the use of deep learning. Recently, with the rapid development of deep learning, researchers have begun to apply deep learning methods to clustering. Basic deep clustering methods first learn an effective representation of the data and then apply a clustering method to group that representation. Deep graph clustering methods fall roughly into two categories: methods based on deep neural networks (DNNs) and methods based on graph convolutional networks (GCNs). DNN-based methods learn graph representations from both node feature and network structure information. GCN-based methods use a GCN to encode both the node features and the network structure into node representations. The latest deep clustering methods exploit the topological structure information and the node features at the same time, and achieve excellent performance.

However, these models tend to pay more attention to learning structural information and rarely process feature information. The loss of information during feature learning is ignored, so the information learned and captured is insufficient. Because the learned feature representation is key to obtaining the target cluster distribution, this directly lowers clustering accuracy. Features therefore need to be learned fully throughout the entire clustering process. In addition, these deep clustering methods ignore the link relationships among nodes, which could improve the accuracy of the clustering results.

Motivated by these observations, we propose a dense DNN structured clustering network in this paper. To address insufficient feature learning and to enhance the flow of features between layers, we propose a multi-layer dense deep neural network to obtain node feature representations, which incorporates the node representations learned by each layer. Our contributions can be summarized as follows:

1) We propose a dense DNN structured clustering network (DDSCN) for graph clustering. A GCN learns the structural information of the original graph data. A transfer operator delivers the node representations learned by the dense DNN to the GCN layers, which promotes the learning of structural information in the GCN. A dual self-supervision module updates the two modules at the same time. The method reuses information by joining the structure learning and feature learning processes.

2) We propose a multi-layer dense deep neural network to obtain node feature representations; it establishes dense connections between DNN layers and realizes feature reuse between them. In particular, a smoothness regularization term, which measures the similarity between nodes, is introduced to optimize the update process of the entire model. We design this term to restructure our loss function so that it takes the direct link relationships between nodes into account.

3) We evaluate DDSCN on four real-world datasets, where it significantly outperforms state-of-the-art methods. This demonstrates that DDSCN is effective for graph clustering.
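
Two ideas named in the abstract, dense connections between DNN layers and a smoothness term over linked nodes, can be illustrated with a minimal pure-Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the function names `dense_forward` and `smoothness_penalty` are hypothetical, dense connectivity is modelled as concatenating the input with all earlier layer outputs, ReLU is an assumed activation, and the smoothness term uses the common graph-Laplacian form 0.5 * sum_ij A_ij * ||z_i - z_j||^2. The thesis itself may define these differently.

```python
def dense_forward(x, layers):
    """Densely connected feed-forward pass for one node's feature vector.

    x      : list of floats (raw node attributes)
    layers : list of (weights, bias) pairs; weights holds one row per output unit
    Assumption: each layer sees the input concatenated with ALL earlier outputs,
    and the final representation is the concatenation of every block.
    """
    features = [x]
    for weights, bias in layers:
        # Concatenate the input and every earlier layer output (feature reuse).
        layer_input = [v for block in features for v in block]
        # Affine map followed by ReLU (assumed activation).
        output = [
            max(0.0, sum(w * v for w, v in zip(row, layer_input)) + b)
            for row, b in zip(weights, bias)
        ]
        features.append(output)
    return [v for block in features for v in block]


def smoothness_penalty(adjacency, embeddings):
    """Assumed Laplacian-style smoothness term over linked nodes.

    Computes 0.5 * sum_ij A_ij * ||z_i - z_j||^2, which is small when
    directly linked nodes have similar representations.
    """
    n = len(adjacency)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if adjacency[i][j]:
                squared_distance = sum(
                    (a - b) ** 2 for a, b in zip(embeddings[i], embeddings[j])
                )
                total += adjacency[i][j] * squared_distance
    return 0.5 * total
```

In a setup like this, the penalty would be added to the clustering loss with a weighting coefficient, so that training pulls the representations of linked nodes together; how DDSCN actually weights and combines its loss terms is not stated in the abstract.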
Keywords/Search Tags: Deep Graph Clustering, Graph Convolutional Network, Dense Deep Neural Network, Regularized Optimization, Self-supervised Learning