| As the popularity of the Internet and the continuous improvement of data storage capacity,the scale of real-world data is experiencing rapid growth.Data entities are interconnected through corresponding association rules,forming large and complex network structures(graphs),such as social networks,protein networks,and transportation networks.These graph-structured data not only exhibit complexity and high dimensionality but also entail a significant amount of unknown correlation information and implicit relationships.However,due to limitations in computational resources(such as time and space resources),effectively mining and analyzing valuable information from graph data requires appropriate preprocessing and dimensionality reduction operations.Graph representation learning,as an emerging machine learning technique,is capable of transforming complex graph-structured data into vector representations in low-dimensional vector spaces,thereby providing a modeling foundation for subsequent tasks such as prediction and classification.This thesis focuses on leveraging the characteristics of community structures in graph data to enhance the effectiveness of graph representation learning.In graph data,community structure refers to nodes in the graph forming distinct communities,where nodes within a community exhibit high similarity while differing significantly from nodes in other communities.Based on the distribution of features within community structures,graph data can be categorized into graphs exhibiting homophily and heterophily.In graphs with homophily,nodes often possess similar features,making it easier to form clusters of information and thereby facilitating information propagation.Conversely,in heterophily,due to the diversity in feature distribution,there exists a higher challenge for graph representation learning methods.Current graph representation learning methods primarily model long-distance dependencies or latent dependencies in graphs with heterophily,which inadequately explore complex graph relationships,thereby affecting the performance of graph representation learning algorithms.Therefore,to overcome the complexity brought by heterophily,this thesis aims to delve into the characteristics of community structures in graph data,exploring more effective methods for modeling complex relationships,fully exploiting and utilizing various structural information in graphs with heterophily,with the aim of providing theoretical and methodological support for the effective application of graph representation learning on graphs with heterophily.The specific research content and contributions are summarized as follows:1.This thesis proposes a heterophily graph neural network model,Signed Attention-based Graph Neural Network(SAGNN),aiming to effectively handle positive and negative relationships in graphs with heterophily,thereby enhancing the expressive power of graph representations.In this model,a signed attention mechanism is presented.Compared to traditional non-negative attention mechanisms,the signed attention mechanism possesses the capability to adaptively learn complex relationships between neighboring nodes.Specifically,this mechanism enhances the similarity of representations between positive neighboring nodes while reducing the similarity of representations between negative neighboring nodes,thereby comprehensively capturing associative information in graphs with heterophily.Furthermore,this study proposes a exactly higher-order neighborhood aggregation method to implicitly encode positional information,thereby specifying the importance of different neighborhoods.This method helps the model more accurately capture the local structural features of nodes in different neighborhoods,thereby improving the representation ability of heterophily graph data.Through extensive experiments on eight benchmark datasets,the proposed SAGNN model demonstrates superior performance compared to baseline methods in tasks such as node classification and visualization,validating the robustness and effectiveness of SAGNN in learning node representations.2.This thesis presents a heterophily graph neural network model,Adaptive Deep Convolutional Graph Neural Network(DCGNN),aimed at effectively handling cyclic structures in graphs to enhance the expressiveness of graph representations.Currently,graph neural network methods based on the message-passing scheme suffer from limitations in identifying cyclic structures in graphs with heterophily,namely the lack of accurate recognition of local substructures.Therefore,the DCGNN model introduces an adaptive deep convolutional approach to reinforce the message passing neural network’s ability to identify cyclic structures.The adaptive deep convolution replaces the simple stacking of first-order convolutional layers in the current message passing scheme by adaptively aggregating local high-order neighborhoods.This adaptive deep convolution can more accurately capture complex structural information in graphs with heterophily.Furthermore,this study theoretically demonstrates that DCGNN exhibits significantly enhanced expressive power compared to current message passing neural networks.Experimental results on eight real-world benchmark network datasets validate that the performance of the DCGNN model in handling graphs with heterophily surpasses several graph neural network methods specifically designed for heterophily.3.This thesis proposes a graph neural network model,Symmetric Positive Definite Graph Neural Network(SPDGNN),designed to effectively handle hierarchical structures in graphs with heterophily,thereby enhancing the expressiveness of graph representations.The method explores the possibility of graph embedding on a non-positive curvature matrix manifold.Symmetric positive definite matrix manifold is a Riemannian geometric space theoretically encompassing Euclidean and hyperbolic projective subspaces,exhibiting rich geometric structures and remarkable computational efficiency.In this study,a graph neural network model on the symmetric positive definite matrix manifold is devised,comprising feature transformation layer,information aggregation layer,nonlinear activation layer,and multi-class logistic regression.Leveraging the characteristics of the symmetric positive definite matrix manifold,this model enhances the modeling effectiveness of hierarchical structures in heterophily.Through extensive experiments on nine real-world complex network datasets,this study validates the superiority of the SPDGNN model over graph neural network models employing Euclidean and hyperbolic geometries in modeling complex networks containing hierarchical and grid structures. |