
Research And Application Of Heterogeneous Information Network Embedding Model

Posted on: 2022-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Tang
Full Text: PDF
GTID: 2480306764977029
Subject: Computer Software and Application of Computer
Abstract/Summary:
Modeling interactive data as a heterogeneous information network is one way to mine the potential value of the data. Through network embedding, such data can be converted into an input format that meets the requirements of downstream prediction algorithms. However, current research still has several problems: tools for extracting heterogeneous features are not strong enough; network semantics are modeled insufficiently; and auxiliary information is rich, yet attribute information and structural information are difficult to integrate fully.

This thesis studies these problems. Its main work is as follows: (1) Two new heterogeneous networks are defined, which take into account the number of relationships, the attributes of relationships, and the attributes of nodes. (2) For the two newly defined networks, corresponding network embedding methods are designed, and their effectiveness is verified on tasks such as classification and clustering. (3) According to the needs of the subject, a system that uses the two embedding methods is designed and implemented.

During embedding, the thesis handles different network characteristics: (1) A network with multiple relationships between nodes is transformed into multiple subnets, so that a node pair with only a single relationship in the original network keeps its structure unchanged in the subnet, while a node pair with multiple relationships has only one relationship in the subnet. (2) Relationship attributes are converted into weights. When the network is sampled along meta paths, weighted sampling is used, which better distinguishes different node objects of the same node type. (3) When fusing structural information and attribute information, a scaling coefficient is calculated from the mean and variance of the two types of feature values, so that the two distributions become consistent and the features do not interfere with each other after fusion.

The main innovations of this thesis are as follows: (1) When modeling the network, multiple relationships between nodes, relationship attributes, and node attributes are considered, which brings the model closer to real-world data. (2) When sampling the original network with meta paths, finer-grained sampling is achieved based on relationship attributes, which improves the ability to extract heterogeneous features. (3) When fusing different kinds of information, a method for aligning the distributions of feature values is designed, so that structural features and attribute features complement each other.

In summary, two new heterogeneous networks are defined and corresponding embedding methods are designed. These embedding methods use the additional information in the network and perform better than traditional embedding methods on clustering and classification tasks. Comparison with variant models verifies the effectiveness of each optimization module. The embedding methods can be applied to professional domain data that fits the network definitions, converting it into structured data and improving data utilization. Behavior patterns between objects can then be mined by prediction algorithms, which improves prediction performance.
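
The abstract does not give formulas, so the following is only a minimal Python sketch of two of the ideas it describes, under stated assumptions. `weighted_metapath_walk` assumes relationship attributes have already been converted into positive edge weights and picks each next node of the required type with probability proportional to that weight. `fuse` assumes the scaling step is a linear rescaling that matches the mean and standard deviation of the attribute features to those of the structural features before concatenation. All function names, parameters, and data layouts here are hypothetical and are not taken from the thesis.

```python
import random
from statistics import mean, pstdev


def weighted_metapath_walk(adj, node_type, start, meta_path, walk_len, rng):
    """Random walk that follows `meta_path` cyclically.

    adj:       {node: {neighbor: weight}}, weight derived from a relationship attribute
    node_type: {node: type name}
    meta_path: e.g. ["author", "paper", "author"]; first and last type must match
    rng:       a random.Random instance, passed in for reproducibility
    """
    if node_type[start] != meta_path[0]:
        raise ValueError("start node must have the first type of the meta path")
    if meta_path[0] != meta_path[-1]:
        raise ValueError("meta path must start and end with the same type")
    period = len(meta_path) - 1
    walk = [start]
    for step in range(1, walk_len):
        wanted = meta_path[step % period]
        # Keep only neighbors of the type required by the meta path.
        candidates = [(v, w) for v, w in adj.get(walk[-1], {}).items()
                      if node_type[v] == wanted and w > 0]
        if not candidates:
            break  # dead end: no neighbor of the required type
        nodes, weights = zip(*candidates)
        # Weighted choice: heavier relationships are sampled more often.
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


def fuse(struct_vecs, attr_vecs):
    """Rescale attribute features to the structural features' mean and
    standard deviation (assumed form of the scaling), then concatenate."""
    flat_s = [x for vec in struct_vecs for x in vec]
    flat_a = [x for vec in attr_vecs for x in vec]
    mu_s, sd_s = mean(flat_s), pstdev(flat_s)
    mu_a, sd_a = mean(flat_a), pstdev(flat_a)
    scale = sd_s / sd_a if sd_a > 0 else 1.0  # the scaling coefficient
    return [list(s) + [(x - mu_a) * scale + mu_s for x in a]
            for s, a in zip(struct_vecs, attr_vecs)]
```

In this sketch, a call such as `weighted_metapath_walk(adj, node_type, "a1", ["author", "paper", "author"], 5, random.Random(0))` would produce one walk over a toy author-paper graph, and `fuse` would be applied to the per-node structural and attribute vectors before they are passed to a downstream classifier or clustering algorithm.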
Keywords/Search Tags: Heterogeneous Information Network, Network Embedding, Multiple Relationships, Meta Path