Research On Efficient And Interpretable Graph Mining Technology

Posted on:2023-02-03

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y M Yang

Full Text:PDF

GTID:1520306917479864

Subject:Computer software and theory

Abstract/Summary:

In recent years, the Internet has generated a large amount of relational data, which can usually be described by graphs (or networks). Graph data contains a wealth of useful knowledge and information that is of great significance to real-world production activities, so how to mine this knowledge effectively is an attractive research topic. Research on discovering knowledge in graph data has gone through roughly three waves: traditional graph mining, graph embedding, and graph neural networks. In the past few years, owing to the rapid development of deep learning and its wide application in graph research, deep graph learning techniques have made substantial progress on many graph analysis tasks.

However, in this dissertation we observe that most existing graph data mining methods still face challenges in four aspects: (1) The diversity of network modality. Real-world graphs can rarely be described by plain nodes and edges alone; they appear in various complex forms such as attribute graphs and heterogeneous graphs. (2) The interpretability of the model. In practical application scenarios, users expect the model not only to make accurate predictions but also to explain why a specific prediction is made. (3) The large scale of the network. A real-world network is usually very large, with a massive number of nodes and edges, so an algorithm needs low time complexity and good scalability to be applied to it effectively. (4) The scarcity of supervision labels. Current mainstream semi-supervised graph deep learning techniques usually rely on high-quality human-annotated labels, which are often expensive to acquire, or even impossible to acquire because of privacy concerns.

To address these challenges, this dissertation aims to design and implement several efficient, interpretable, and user-friendly methods for mining valuable knowledge and information from graph data. Our main contributions are summarized in the following four aspects.

(1) We propose a new attributed graph clustering framework. Attribute graph clustering aims to identify clusters that show both structural cohesiveness and attribute homogeneity. We note that existing methods ignore the fact that, in an attribute graph, different clusters tend to correlate with different attribute dimensions. We therefore define and optimize a weight vector that describes the correlation between clusters and attributes, which helps the model capture the personalized correlation pattern between structure and attributes. We formulate attribute graph clustering as a bi-objective optimization problem and develop an efficient heuristic optimization algorithm, in which the correlation weight vector can be updated synchronously or asynchronously. Theoretical analysis and experimental evaluation show that the framework is both effective and efficient.

(2) We propose a novel graph substructure assembling neural network. Many existing methods achieve high performance on graph classification, but none of them can effectively identify discriminative substructures, which limits their interpretability. In this study, we aim to design a graph neural network that not only achieves high classification performance but also identifies task-specific discriminative substructure features, thereby improving the interpretability of the model. Because the neighbors of a node in graph data have no natural order, we further propose an attention-based sorting mechanism that automatically learns the order of neighbors. This helps the model achieve higher performance as well as lower variance. The experimental results show that the proposed method achieves high classification performance and effectively discovers discriminative substructure features, which gives the model good interpretability.

(3) We propose an innovative heterogeneous graph convolutional neural network. We notice that most existing heterogeneous graph neural network methods have two limitations: 1) they require users to manually specify several useful task-specific meta-paths, which is a difficult task for users; 2) before performing the graph convolution operation, they require additional, time-consuming pre-processing, which limits their efficiency. We therefore design an efficient network architecture with three key steps: feature projection, object-level aggregation, and type-level aggregation. Theoretical analysis and experimental results show that the proposed method can automatically evaluate the importance of all possible meta-paths and identify the useful ones for a specific task. By exploiting the structural features conveyed by these useful meta-paths, the model achieves high performance, and the specific semantics they carry give the model good interpretability.

(4) We propose a novel self-supervised heterogeneous graph pre-training framework. Traditional semi-supervised graph neural networks are usually trained under the guidance of supervision labels, which are expensive to acquire. To alleviate this problem, researchers have recently proposed several methods for pre-training graph neural networks in a self-supervised manner. However, the performance of these methods usually depends on specific strategies for generating positive and negative samples, which limits their flexibility and generalization ability. To address this issue, the proposed framework generates pseudo-labels through structural clustering on heterogeneous graphs and uses them to guide the learning of heterogeneous graph neural networks, without generating any positive or negative samples. We transfer the learned representations to various downstream graph analytical tasks. The experimental results demonstrate that the proposed framework achieves superior performance, even surpassing some traditional semi-supervised baselines.

Finally, we conclude the dissertation and discuss possible future research directions for graph mining technology.
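The three steps named in contribution (3), feature projection, object-level aggregation, and type-level aggregation, can be illustrated with a minimal sketch. This is not the dissertation's architecture: the abstract does not specify the operators, so mean pooling for object-level aggregation, dot-product attention for type-level aggregation, and all names (`hetero_layer`, `proj`, `query`, the toy paper/author graph) are assumptions made purely for illustration.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hetero_layer(features, node_type, neighbors, proj, query):
    # Step 1, feature projection: map each node type's raw features
    # into one shared space using a per-type matrix (assumed linear).
    h = {v: matvec(proj[node_type[v]], x) for v, x in features.items()}

    out, weights = {}, {}
    for v in features:
        # Step 2, object-level aggregation: pool neighbors of the same
        # type into one summary vector (mean pooling is an assumption).
        by_type = {}
        for u in neighbors[v]:
            by_type.setdefault(node_type[u], []).append(h[u])
        labels = ["self"] + sorted(by_type)
        summaries = [h[v]] + [mean(by_type[t]) for t in sorted(by_type)]

        # Step 3, type-level aggregation: weight the per-type summaries
        # with attention scores (dot product with a query vector).
        scores = [sum(q * s for q, s in zip(query, vec)) for vec in summaries]
        alpha = softmax(scores)
        out[v] = [sum(a * vec[i] for a, vec in zip(alpha, summaries))
                  for i in range(len(query))]
        weights[v] = dict(zip(labels, alpha))
    return out, weights

# Toy heterogeneous graph: one paper linked to two authors.
features = {"p1": [1.0, 0.0], "a1": [0.0, 1.0, 1.0], "a2": [1.0, 0.0, 0.0]}
node_type = {"p1": "paper", "a1": "author", "a2": "author"}
neighbors = {"p1": ["a1", "a2"], "a1": ["p1"], "a2": ["p1"]}
proj = {
    "paper": [[1.0, 0.0], [0.0, 1.0]],             # 2 -> 2
    "author": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # 3 -> 2
}
query = [1.0, 0.0]

out, weights = hetero_layer(features, node_type, neighbors, proj, query)
print(weights["p1"])  # attention weight per neighbor type for node p1
```

In a sketch like this, the per-type attention weights are the part a user could inspect: stacking several such layers would let the products of weights along a path hint at how much each type sequence contributes, which is one plausible reading of how meta-path importance might be evaluated without specifying meta-paths by hand.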

Keywords/Search Tags:

Graph Mining, Attribute Graph, Heterogeneous Graph, Graph Neural Network, Graph Clustering, Graph Classification

Related items

1	Research On Graph Classification Based On Graph Neural Networks
2	Some Problems Of Geometric Graph Theory
3	Research And Application Of Heterogeneous Graph Network Algorithms
4	Research On Hierarchical Architecture Graph Classification Based On Deep Learning
5	Research On Graph Classification Methods Based On Graph Substructures
6	Structure-based Updatable Graph Pooling For Graph Classification
7	Graph Neural Networks For Complex Heterogeneous Graph Representation Learning
8	Research On Graph Classification Technology Based On Graph Neural Network
9	Attributed Graph Clustering Based On Graph Convolutional Networ
10	A Study On Graph Pooling Method For Graph Classification