| Network representation learning aims to map the nodes of an information network into a low-dimensional vector space, which is of great importance for network analysis tasks. Depending on the complexity and semantic diversity of the information network, network representation learning methods are divided into homogeneous and heterogeneous network representation learning. Heterogeneous information networks carry more complex structural and semantic information, which poses great challenges for representation learning. Most traditional network representation learning methods are based on random walks or meta-paths; however, these methods rely on shallow models, which struggle to capture the rich structural and semantic information in heterogeneous information networks. Graph convolutional neural networks capture network topology better, but they have been studied mainly on homogeneous information networks and ignore the rich semantic information in heterogeneous ones. To address these difficulties in analyzing heterogeneous information networks, this thesis carries out the following two parts of research. (1) A heterogeneous information network representation learning method based on meta-path and attribute fusion is proposed. First, to handle the heterogeneity of nodes, the feature vectors of the target node and its neighbor nodes are dimensionally transformed to obtain the correlation between the target node and its different connected objects, and the updated target node vector is then generated by feature fusion. Second, node-level attention is applied on each meta-path to learn a meta-path-specific embedding vector of the target node. Finally, a multilayer perceptron (MLP) is designed to learn the final node embeddings across different meta-paths using semantic-level attention. This method is compared with six baseline methods on three public datasets in terms of node classification, clustering, parameter analysis, and visualization, and the experimental results show that it is effective for analyzing heterogeneous information networks. (2) A heterogeneous information network representation learning method that combines meta-paths and graph convolution is proposed. Graph convolution is better at capturing network structure information but is not readily applicable to heterogeneous information networks. To efficiently acquire the rich semantic and structural information in such networks, this part of the study designs a heterogeneous association matrix over meta-paths to capture higher-order semantic information of target nodes. The method is divided into a semantic learning phase, a structural learning phase, and an information fusion phase. In the semantic learning phase, semantic-level node embedding vectors are learned using the HAN model. In the structural learning phase, the designed heterogeneous association matrix and the node feature matrix are used as inputs to a graph convolutional neural network to obtain structural node embedding vectors. Finally, in the information fusion phase, the node embedding vectors from the two phases are aggregated using an attention mechanism to generate the final node embedding representation. This method is evaluated on three public datasets and compared with seven baseline methods in node classification, clustering, ablation experiments, and parameter analysis, and the experimental results validate its effectiveness. |
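
The three phases of the second method can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes that the heterogeneous association matrix behaves like a meta-path commuting matrix (here Paper-Author-Paper, built from a toy incidence matrix), that one graph convolution step is a row-normalized propagation followed by ReLU, and that the fusion is a softmax-weighted sum. All names (`metapath_matrix`, `gcn_layer`, `attention_fuse`), the toy data, the stand-in semantic embeddings, and the attention scores are hypothetical.

```python
import math

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def metapath_matrix(incidence):
    """Assumed association matrix: incidence @ incidence^T (e.g. Paper-Author-Paper)."""
    transposed = [list(col) for col in zip(*incidence)]
    return matmul(incidence, transposed)

def row_normalize(m):
    """Divide each row by its sum so that every non-empty row sums to 1."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

def gcn_layer(adj_norm, features, weight):
    """One propagation step: ReLU(adj_norm @ features @ weight)."""
    h = matmul(matmul(adj_norm, features), weight)
    return [[max(0.0, v) for v in row] for row in h]

def attention_fuse(embeddings, scores):
    """Softmax the scores and return the weighted sum of the embedding matrices."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    weights = [e / sum(exps) for e in exps]
    rows, cols = len(embeddings[0]), len(embeddings[0][0])
    fused = [[sum(w * emb[i][j] for w, emb in zip(weights, embeddings))
              for j in range(cols)] for i in range(rows)]
    return fused, weights

# Toy data (hypothetical): 3 papers, 2 authors; entry 1 means "paper written by author".
paper_author = [[1, 0], [1, 1], [0, 1]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # node feature matrix
weight = [[1.0, 0.0], [0.0, 1.0]]                 # identity weights for readability

# Structural learning phase: meta-path matrix plus features through one GCN-style layer.
pap = metapath_matrix(paper_author)
structural = gcn_layer(row_normalize(pap), features, weight)

# Semantic learning phase: stand-in values in place of HAN output (assumption).
semantic = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]

# Information fusion phase: in practice the scores would be learned, here they are fixed.
fused, weights = attention_fuse([structural, semantic], scores=[2.0, 1.0])
```

In this sketch `pap[i][j]` counts the authors shared by papers `i` and `j`, so the diagonal already acts as a self-connection; a symmetric normalization with explicit self-loops would be an equally reasonable choice.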