Research On Heterogeneous Information Network Classification Based On Graph Convolution Network

Posted on: 2022-07-15
Degree: Master
Type: Thesis
Country: China
Candidate: X Liu
GTID: 2518306536453244
Subject: Control Engineering
Abstract/Summary:
Classification is an important branch of machine learning: the class of unseen data is predicted by learning from the features of known data. Classification models have a wide range of real-life applications, including news text classification, spam filtering, bank customer rating, and risk assessment in the financial industry. However, with the rapid development of the Internet and mobile communications, the volume of data continues to grow and the relationships among data have become increasingly complex. Earlier studies of classification usually assumed that data samples were independent of each other, but this assumption no longer holds for today's complex data relationships. For Heterogeneous Information Network (HIN) data, which is abstracted from such complex relationships, there are two main approaches: (1) ignoring the relationship features in the HIN and applying traditional machine learning classification models; (2) ignoring the relationship types and converting the labels of neighboring nodes into features. Either way, some of the relational information between data is lost.

This thesis investigates the direct exploitation of relationships in heterogeneous information networks by using relational features directly for graph convolutional feature extraction. Based on the Graph Convolutional Network (GCN) model, it proposes two GCN-based HIN classification models: one based on aggregated information, Graph Convolutional Network Aggregation (GCN-A), and one based on residual information, Residual Graph Convolutional Network (Res GCN). The main contents are as follows.

(1) The GCN-A algorithm is proposed, motivated by the complex association relationships among the nodes under study in a heterogeneous information network. The algorithm first decomposes the heterogeneous information network into multiple homogeneous networks carrying different semantics, using a meta-path-based segmentation method. It then uses GCN to extract features from each semantic homogeneous network in turn, fuses the features extracted from each semantic layer, and finally feeds the fused features into a classifier for classification learning. Experimental results on three standard heterogeneous information network datasets show that GCN-A outperforms the heterogeneous information network classification algorithms it is compared with.

(2) The Res GCN algorithm is proposed. It also builds on the complex association relationships among the nodes under study, and it addresses a drawback of GCN-A: its results are influenced by the semantic information of the top-ranked homogeneous networks. For each layer of semantic features, a residual term is introduced to strengthen the proportion of the studied nodes' own features, with the final aim of improving the classification predictions. Experimental results on three standard heterogeneous information network datasets show that Res GCN outperforms the compared heterogeneous information network classification algorithms and improves on GCN-A.

(3) Algorithm application analysis. A terrorist attack event dataset and a terrorist relationship dataset are extracted from a terrorist event knowledge base and treated as heterogeneous information networks, and the GCN-A and Res GCN algorithms are applied to them. The experimental results on the two datasets show that both proposed algorithms outperform other heterogeneous information network classification algorithms in classification prediction.
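The GCN-A and Res GCN pipelines summarized in the abstract can be illustrated with a minimal pure-Python sketch. It is not the thesis's implementation: the toy paper/author/venue network, the function names, the meta-path projection by incidence-matrix product, the concatenation used for fusion, the additive form of the residual, and the identity weight matrix standing in for trained parameters are all assumptions made for illustration. Only the symmetric normalization follows the standard GCN propagation rule.

```python
# Illustrative sketch only: toy data, hypothetical names, untrained weights.

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def metapath_adjacency(incidence):
    """Project a target-to-other-type incidence matrix (e.g. paper-author)
    onto the target nodes along the meta-path target-other-target (e.g. P-A-P).
    Two target nodes are linked if they share at least one intermediate node."""
    counts = matmul(incidence, transpose(incidence))
    n = len(counts)
    return [[1.0 if i != j and counts[i][j] > 0 else 0.0 for j in range(n)]
            for i in range(n)]

def normalize(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 from the standard GCN."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)] for i in range(n)]

def gcn_layer(adj_norm, feats, weight, residual=False):
    """One propagation step ReLU(A_norm X W). With residual=True the node's own
    input features X are added back (assumed residual form; needs square W)."""
    h = matmul(matmul(adj_norm, feats), weight)
    h = [[max(0.0, v) for v in row] for row in h]
    if residual:
        h = [[hv + xv for hv, xv in zip(h_row, x_row)] for h_row, x_row in zip(h, feats)]
    return h

def fuse(per_path_feats):
    """Fuse per-meta-path features by concatenation (assumed fusion operator)."""
    return [sum(rows, []) for rows in zip(*per_path_feats)]

def extract_features(incidences, feats, weight, residual=False):
    """Decompose the HIN by meta-path, run one GCN layer per semantic network,
    then fuse. residual=False mimics GCN-A, residual=True mimics Res GCN."""
    per_path = []
    for incidence in incidences:
        adj_norm = normalize(metapath_adjacency(incidence))
        per_path.append(gcn_layer(adj_norm, feats, weight, residual))
    return fuse(per_path)

# Toy HIN: 3 papers (target nodes), 2 authors, 2 venues.
paper_author = [[1, 0], [1, 1], [0, 1]]
paper_venue = [[1, 0], [0, 1], [0, 1]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one row per paper
weight = [[1.0, 0.0], [0.0, 1.0]]                # placeholder for learned W

plain = extract_features([paper_author, paper_venue], features, weight)
res = extract_features([paper_author, paper_venue], features, weight, residual=True)
print(len(plain), len(plain[0]))  # nodes x fused feature width
```

The fused rows would then be passed to any classifier. In this sketch the residual variant differs from the plain one only by adding each node's input features back to every per-meta-path output, which is one simple way to increase the weight of a node's own features.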
Keywords/Search Tags: Heterogeneous Information Network, Graph Convolution Network, Graph Convolution Network Aggregation, Residual Graph Convolution Network