Research On Similarity Search Algorithm Of Heterogeneous Information Network Based On Hierarchical Attention

Posted on: 2022-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Wu
GTID: 2480306731972519
Subject: Computer technology
Abstract/Summary:
Real-world data often has a graph structure, as in social networks. Such a heterogeneous network is called a heterogeneous information network. Similarity measurement is a basic task in heterogeneous information networks; it has been widely studied and is often used in information retrieval, recommendation, and related applications. The common similarity measurement algorithms, whether based on meta-paths or meta-graphs, share two problems. First, when calculating the similarity of a target node, they do not take into account that different neighbor nodes differ in importance to the target node. Second, when capturing relational semantics, they ignore that different meta-paths of a target node also differ in importance. Capturing the importance of nodes and meta-paths in the network, and using it in similarity measurement, is therefore an important step toward accurately measuring the similarity between objects. Because similarity measurement is a basic function in many applications, including recommendation, users' perception of the target node is also an important factor. When measuring the similarity between nodes, we should therefore consider not only the semantic information but also the attribute information of the nodes in the network, in order to achieve higher accuracy. This paper addresses these problems in existing similarity search methods. The specific research is as follows:

(1) To address the problem that existing similarity search algorithms do not take into account the differing importance of nodes and meta-paths, a similarity measurement algorithm based on hierarchical attention, HAttSim, is proposed to capture the importance of different nodes and meta-paths. First, the algorithm projects the different types of nodes in a meta-path into the same feature space, uses a self-attention mechanism to calculate the weight coefficient of each neighbor node with respect to the target node, and generates a node-level embedding. Next, the node-level embedding is transformed by a nonlinear function, and the weight of each meta-path is obtained by taking the product of the transformed node-level embedding and the semantic-level attention vector and passing it through the softmax function. Finally, the node-level embeddings are fused using the semantic-level attention weights to generate the final node vector, which is used to calculate the similarity between object nodes. The experimental results show that, compared with five baseline algorithms, the HAttSim algorithm proposed in this paper performs best in classification and clustering tasks and produces more accurate similarity search results.

(2) To address the problem that the above algorithm uses only the semantic information in the network and does not consider the attribute information of the objects themselves, a similarity measurement algorithm, HAttSimExt, is proposed that combines external attribute information with hierarchical attention. Building on HAttSim, the algorithm introduces attribute information: after the node vector of the target node is obtained under hierarchical attention, the attribute information of the target node is projected into the feature space of the node, self-attention is used to learn the weight between each attribute of the target node and the target node itself, and an embedding vector of the target node's attributes is obtained. The target node vector and the attribute vector are then concatenated to give the final node vector, which is used to calculate the similarity between object nodes. The experimental results show that, compared with three comparison algorithms, the HAttSimExt algorithm proposed in this paper has advantages in similarity ranking and performs better in classification and clustering tasks.
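
The two attention levels and the attribute extension described in the abstract can be illustrated with a small pure-Python sketch. This is not the thesis implementation. It assumes features are already projected into a shared space, scores neighbors with a LeakyReLU over a concatenated pair, applies tanh without a learned weight matrix, computes meta-path weights per node, and uses cosine similarity as the final measure. The abstract specifies none of these details. All function names, vectors, and meta-path labels are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(weights, vectors):
    """Return sum_i weights[i] * vectors[i] as a new vector."""
    dim = len(vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, vectors)) for k in range(dim)]

def node_level_embedding(target, neighbors, attn_vec):
    """Node-level attention: weight each meta-path neighbor of the target."""
    def leaky_relu(x, slope=0.2):
        return x if x > 0 else slope * x
    # Score = LeakyReLU(attn_vec . [target || neighbor]) (assumed scoring form).
    scores = [leaky_relu(dot(attn_vec, target + nb)) for nb in neighbors]
    weights = softmax(scores)
    return weighted_sum(weights, neighbors), weights

def semantic_level_embedding(path_embeddings, semantic_vec):
    """Semantic-level attention: weight each meta-path embedding of one node."""
    # Nonlinear transform (tanh), then product with the semantic attention vector.
    scores = [dot(semantic_vec, [math.tanh(x) for x in z]) for z in path_embeddings]
    weights = softmax(scores)
    return weighted_sum(weights, path_embeddings), weights

def extend_with_attributes(node_vec, attr_vecs, attr_attn_vec):
    """Attribute extension: attend over projected attributes, then concatenate."""
    scores = [dot(attr_attn_vec, node_vec + a) for a in attr_vecs]
    attr_embedding = weighted_sum(softmax(scores), attr_vecs)
    return node_vec + attr_embedding  # list concatenation ("splicing")

def cosine_similarity(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0

# Toy data in an assumed shared 2-d feature space; all values are made up.
target = [1.0, 0.0]
neighbors_by_path = {
    "path_1": [[0.9, 0.1], [0.2, 0.8]],
    "path_2": [[0.5, 0.5], [0.1, 0.9], [0.7, 0.3]],
}
attn_vec = [0.5, -0.5, 1.0, -1.0]   # hypothetical learned vector, length 2 * dim
semantic_vec = [1.0, 0.5]           # hypothetical learned vector, length dim

path_embeddings = [
    node_level_embedding(target, nbs, attn_vec)[0]
    for nbs in neighbors_by_path.values()
]
final_vec, path_weights = semantic_level_embedding(path_embeddings, semantic_vec)
extended_vec = extend_with_attributes(
    final_vec, [[0.3, 0.7], [0.6, 0.4]], [0.1, 0.2, 0.3, 0.4]
)
```

In a trained model the attention vectors and projections would be learned parameters. Here they are fixed constants so that the data flow from neighbors to meta-paths to the final node vector is easy to follow.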
Keywords/Search Tags:Heterogeneous information network, Similarity search, Attention mechanism, External attribute information