| In a complex network composed of different information objects, the interactions among network objects are also diverse; a complex network composed of multiple types of nodes and relationships is called a heterogeneous network. In many problem settings, such a network reflects the rich and diverse information of the real world more realistically and effectively. However, different node types in a heterogeneous network influence information updating and diffusion differently. It is therefore of great significance to study the classification of heterogeneous network nodes. Existing research on network node classification is mainly based on meta-paths defined from prior knowledge, where each meta-path is a sequence of nodes. Because the types of nodes and links in a heterogeneous network differ, and such differences are often not reflected in the meta-path, existing node classification methods are not applicable to heterogeneous networks. Therefore, based on the heterogeneity of such networks, this thesis proposes a heterogeneous network node classification method that uses the link relationships between nodes as features. First, network representation learning is used to obtain the features of node relationships in the heterogeneous network. Then a classification model is trained and tested on the obtained node features. Finally, heterogeneous network node classification is realized. The main work of this thesis comprises the following four points: (1) Heterogeneous network data preprocessing. The node data are first structured in the form of node pairs. Then, based on the characteristics of heterogeneous information network data, a random walk method is used to obtain the node data. (2) Network representation learning of nodes. First, at the input layer, the acquired node data are preprocessed into the corresponding input; then, in the hidden layer, the relationship features of the nodes are learned by a neural network; finally, the learned node features are output as vectors at the output layer. (3) Classification of heterogeneous network nodes. The obtained node feature vectors serve as the classification data set; a LightGBM classification model is trained on them and used for classification; the results are compared with those of other methods; and finally, some results are visualized. (4) Based on real data, this thesis verifies the effect of combining network representation learning with the LightGBM classification model on the classification of heterogeneous network nodes. The experimental results show that the combination improves the classification accuracy of heterogeneous network nodes. |
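
The preprocessing step in point (1), node pairs followed by random walks, can be illustrated with a minimal sketch. It is not the thesis's implementation. The toy node pairs, the type prefixes (`a` for author, `p` for paper, `v` for venue), the uniform choice of the next neighbor, and the function names `build_adjacency`, `random_walk`, and `generate_walks` are all assumptions made for illustration; the abstract does not specify the walk strategy or the data set.

```python
import random
from collections import defaultdict

# Hypothetical toy data: node pairs (edges) of a small heterogeneous network.
# The prefix of each id encodes an assumed node type:
# a = author, p = paper, v = venue.
NODE_PAIRS = [
    ("a1", "p1"), ("a2", "p1"), ("a2", "p2"),
    ("p1", "v1"), ("p2", "v1"), ("a3", "p2"),
]


def build_adjacency(pairs):
    """Turn undirected node pairs into an adjacency list."""
    adj = defaultdict(list)
    for u, v in pairs:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def generate_walks(adj, walks_per_node, length, seed=0):
    """Start `walks_per_node` walks from every node in the network."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    nodes = sorted(adj)
    walks = []
    for _ in range(walks_per_node):
        for node in nodes:
            walks.append(random_walk(adj, node, length, rng))
    return walks


adj = build_adjacency(NODE_PAIRS)
walks = generate_walks(adj, walks_per_node=2, length=5)
```

Each walk is a sequence of node ids that mixes node types. In a pipeline like the one described, such sequences would serve as the input to the representation-learning network, whose output vectors would then be passed to the LightGBM classifier; those two stages are not shown here.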