Graph neural networks (GNNs) apply traditional deep neural networks to graph-structured data. They have achieved great success in graph representation learning, with applications in tasks such as recommendation systems, image classification, molecular property prediction, and relation extraction. However, the current mainstream message-passing GNNs face serious scalability and over-smoothing issues, making them difficult to extend to large-scale graph data. To address these problems, this thesis optimizes the scalability of GNNs and uses a GNN structure with good scalability to solve the over-smoothing problem while maintaining the model's expressive power. The main contributions of this thesis are as follows:

1. To address the over-smoothing that makes the output features of deep GNNs nearly identical, as well as their poor scalability, this thesis proposes a Message Gain Aggregation Architecture (MGAA) based on the separation of target nodes from their neighbors. The model processes target nodes and their neighbors individually during training, preserving the local feature information of the nodes, preventing node representations from converging toward one another, and thereby avoiding over-smoothed outputs. A multilayer perceptron extracts each node's local information, producing an intrinsic result and quantifying the node's message gain. Linear diffusion then aggregates and combines the message gains of the target node's multi-hop neighbors into a neighbor gain (i.e., an influence factor). The final output combines the intrinsic result with the neighbor gain; a code sketch of this pipeline is given below. Theoretical complexity analysis and experimental evaluation demonstrate that MGAA maintains linear complexity and that its performance continues to improve as the number of layers increases, confirming its scalability and its ability to address the over-smoothing problem. Experimental results also show that MGAA achieves the best performance on the Cora, PubMed, Coauthor, and Yelp datasets.

2. Further study of the basic MGAA model revealed that, in deep MGAA models, the message-gain aggregation process itself suffers from over-smoothing and over-correlation of the message gains, leaving the neighbor gain highly redundant and low in useful information. To address this problem, this thesis proposes two improvements: explicit regularization, which adds a regularization term to the loss function to reduce the dimensional correlation and similarity of message gains across nodes; and inter-layer normalization, which adds a normalization layer after each aggregation step to hold similarity and correlation metrics fixed. Together, these resolve the over-smoothing and over-correlation caused by message-gain aggregation; a sketch of both operations follows the MGAA example below. Experiments show that the improved DMGAA model can aggregate neighbor gains over more hops and achieves better performance than MGAA in deep network settings, while also performing well in comparison experiments against multiple GNN models.
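Since the abstract does not give the exact equations, the following is a minimal PyTorch sketch of the MGAA pipeline as described above. The class name, the num_hops and alpha parameters, and the use of a normalized adjacency matrix for the linear diffusion are illustrative assumptions, not the thesis's actual implementation.

    import torch
    import torch.nn as nn

    class MGAA(nn.Module):
        """Sketch of the described Message Gain Aggregation Architecture.

        An MLP processes each target node on its own, producing an intrinsic
        result and a message gain; linear diffusion over the normalized
        adjacency aggregates multi-hop message gains into a neighbor gain,
        which is combined with the intrinsic result at the output.
        """

        def __init__(self, in_dim, hidden_dim, out_dim, num_hops=3, alpha=0.5):
            super().__init__()
            # MLP on node features alone: keeps local information separate
            # from neighborhood information (node/neighbor separation).
            self.mlp = nn.Sequential(
                nn.Linear(in_dim, hidden_dim), nn.ReLU(),
                nn.Linear(hidden_dim, out_dim),
            )
            self.gain_head = nn.Linear(out_dim, out_dim)  # quantifies the message gain
            self.num_hops = num_hops  # how many hops of neighbor gains to diffuse
            self.alpha = alpha        # intrinsic-vs-neighbor mixing weight (assumed)

        def forward(self, x, adj_norm):
            # x: (N, in_dim) node features; adj_norm: (N, N) normalized adjacency.
            intrinsic = self.mlp(x)           # per-node intrinsic result
            gain = self.gain_head(intrinsic)  # per-node message gain
            neighbor_gain = torch.zeros_like(gain)
            hop = gain
            for _ in range(self.num_hops):    # linear diffusion: no per-hop nonlinearity
                hop = adj_norm @ hop          # push gains one hop further
                neighbor_gain = neighbor_gain + hop
            # Output: intrinsic result combined with the aggregated neighbor gain.
            return self.alpha * intrinsic + (1 - self.alpha) * neighbor_gain

Because the diffusion consists only of repeated (sparse) matrix products with no learned per-hop transformation, adding hops increases cost linearly, which is consistent with the scalability claim above.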
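Similarly, a hedged sketch of the two DMGAA improvements follows. The covariance-based decorrelation penalty and the per-hop L2 feature normalization shown here are plausible instantiations of "explicit regularization" and "inter-layer normalization"; the thesis's exact formulations may differ.

    import torch
    import torch.nn.functional as F

    def decorrelation_penalty(gain):
        # Explicit regularization (sketch): penalize off-diagonal entries of
        # the feature covariance matrix so message-gain dimensions stay
        # decorrelated instead of becoming redundant.
        z = gain - gain.mean(dim=0, keepdim=True)
        cov = (z.T @ z) / max(gain.shape[0] - 1, 1)
        off_diag = cov - torch.diag(torch.diag(cov))
        return off_diag.pow(2).sum()

    def normalized_diffusion(gain, adj_norm, num_hops):
        # Inter-layer normalization (sketch): re-normalize after every
        # aggregation step so similarity/correlation statistics stay fixed
        # rather than collapsing as hops accumulate.
        neighbor_gain = torch.zeros_like(gain)
        hop = gain
        for _ in range(num_hops):
            hop = F.normalize(adj_norm @ hop, p=2, dim=1)  # fix per-node feature scale
            neighbor_gain = neighbor_gain + hop
        return neighbor_gain

    # Hypothetical training-time use: add the penalty to the task loss,
    # e.g. loss = F.cross_entropy(logits, labels) + 1e-3 * decorrelation_penalty(gain)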