Research on Heterogeneous Information Network Representation Learning Based on Attention Mechanism

Posted on: 2024-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: T T Liu
GTID: 2568306923471334
Subject: Information and Communication Engineering

Abstract/Summary:
In today's Internet era, the volume of network data is growing explosively, and its forms are complex and changeable. Heterogeneous information networks (HINs) are ubiquitous in real life, for example online social networks, literature citation networks, biological protein networks, and shopping networks. HINs are rich in topological and semantic information. HIN representation learning is a prerequisite for data mining, information retrieval, and analysis, and it has great practical and research significance in fields such as social networking, e-commerce recommendation, and medical research. Traditional network representation learning models data as a homogeneous information network, ignoring the diversity of objects and connections within the network and causing substantial information loss. In recent years, researchers have started to model data as heterogeneous information networks in order to fully learn the topological and semantic information of the network. However, when traditional network representation learning algorithms are applied directly to heterogeneous information networks, the learned information is incomplete and less interpretable. Although much progress has been made in HIN representation learning, problems remain, such as insufficiently targeted algorithms, incomplete information extraction, and neglect of temporal evolution patterns. As one of the most widely used and effective structures in network learning, the attention mechanism plays a key role in improving the representation learning ability for heterogeneous information networks. To address these problems, this thesis studies HIN representation learning based on the attention mechanism. The main work is as follows:

(1) To address the insufficient targeting and incomplete information extraction of existing HIN representation learning algorithms, a HIN representation learning algorithm based on multi-view fusion, MFHE, is designed. MFHE consists mainly of four modules: node feature space transformation, subview information extraction, multi-view information fusion, and training. Node feature space transformation maps the representations of different node types into a common, lower-dimensional feature space. Subview information extraction divides the node information associated with each subview according to the given meta-path rules and models the local information through a multi-head attention mechanism. Multi-view fusion then proceeds in three steps: first, the local information of each subview is extracted; second, the correlation of the same node across subviews is calculated using a spatial matrix; finally, global information extraction is completed by summing the fused view information over all subviews. Experimental results show that the classification performance of MFHE on the ACM and DBLP datasets improves by 1.72% and 0.90%, and the clustering performance improves by 3.08% and 1.93%. MFHE improves the quality of HIN representation learning, producing richer and more comprehensive representations and more accurate predictions on downstream tasks. However, it ignores temporal evolution in the network and is therefore not suitable for tasks such as dynamic link prediction.

(2) To address the fact that MFHE ignores temporal evolution, a dynamic heterogeneous information network (DHIN) representation learning algorithm based on a temporal attention mechanism, DynTAM, is designed. DynTAM divides the dynamic heterogeneous information network into multiple time snapshot networks according to a designed time interval. The topological information of each snapshot network is learned by MFHE, and the attention value of each historical snapshot network with respect to the nodes in the current network is then obtained through the temporal attention mechanism. The network representation, carrying both structural and temporal information, is obtained by weighted summation. Node classification, node clustering, and link prediction experiments are carried out on the Wikipedia and Reddit datasets. The node classification and node clustering results of DynTAM are close to those of MFHE. In the link prediction experiments, however, accuracy improves by 0.70% and 0.95% on average compared with dynamic algorithms, and by 11.70% and 8.04% on average compared with static HIN algorithms. The experimental results show that DynTAM can learn both the heterogeneity and the dynamics of the network and is suitable for downstream tasks such as prediction at future time steps.
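
As an illustration of the temporal attention step described for DynTAM (attention values of historical snapshots relative to the current one, followed by weighted summation), the following is a minimal sketch in plain Python. It is not the thesis's implementation: the scaled dot-product scoring, the use of the most recent snapshot as the query, and the names `softmax` and `temporal_attention` are all assumptions made for the example. In DynTAM the per-snapshot embeddings would come from MFHE; here they are simply given as lists of floats.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def temporal_attention(snapshot_embeddings):
    """Fuse one node's per-snapshot embeddings into a single vector.

    snapshot_embeddings: list of T vectors (oldest first), each of length d.
    The last vector is treated as the current snapshot and used as the query.
    Scaled dot-product scoring is an assumption, not the thesis's formula.
    """
    query = snapshot_embeddings[-1]
    d = len(query)
    # One attention score per snapshot: similarity to the current snapshot.
    scores = [
        sum(q * k for q, k in zip(query, h)) / math.sqrt(d)
        for h in snapshot_embeddings
    ]
    weights = softmax(scores)
    # Weighted summation over snapshots, dimension by dimension.
    fused = [
        sum(w * h[i] for w, h in zip(weights, snapshot_embeddings))
        for i in range(d)
    ]
    return weights, fused


if __name__ == "__main__":
    # Hypothetical embeddings of one node in three snapshots (d = 2).
    history = [[0.1, 0.9], [0.4, 0.6], [0.8, 0.2]]
    weights, fused = temporal_attention(history)
    print("attention weights:", weights)
    print("fused embedding:", fused)
```

The weights sum to one, so the fused vector is a convex combination of the snapshot embeddings; snapshots that resemble the current state contribute more under this particular scoring choice.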
Keywords/Search Tags:Network representation learning, heterogeneous information network (HIN), dynamic heterogeneous information network (DHIN), attention mechanism, meta-path