Research And Implementation Of A Network Embedding Method Based On Multi-hop Random Walk

Posted on: 2021-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: H Wei
Full Text: PDF
GTID: 2518306050965339
Subject: Master of Engineering
Abstract/Summary:
With the arrival of the information era and the spread of mobile smart devices, people generate an immense amount of data at every moment of daily life. A network is a common form of data used to represent associations between objects, and networks are therefore ubiquitous, covering all aspects of human life. A good network representation method is needed to make the analysis and use of network data easier and more efficient. Network embedding is a technique for constructing such a representation: it uses the original network to obtain a low-dimensional vector representation of the network data. This low-dimensional vector preserves the relevant latent information of the original network, making it easier to analyze the network and use it effectively in machine learning.

DeepWalk is one network embedding method. It uses random walks to obtain sequences of nodes and then feeds these node sequences to a word embedding method to obtain a low-dimensional vector representation of the network. Real networks, however, are often unbalanced; that is, they are likely to contain clusters, and a cluster may be connected to another cluster by only a few nodes. When node sequences are generated by random walk on such a network, the choice of the starting node affects the sequences produced. Moreover, nodes outside the cluster are unlikely to be visited because a random walk tends to stay within the cluster, which loses much node information and degrades the final embedding.

To address these limitations of network embedding on networks with clusters, and after a review of the literature, we propose a new network embedding method, NEMRW (Network Embedding with Multi-hop Random Walk). NEMRW first joins the node sequences obtained through multi-hop random walks. The multi-hop random walk raises the probability of jumping out of a cluster while traversing the network, so the problems described above can be effectively avoided. The resulting node sequences are then fed to the skip-gram model, whose hidden layer is learned by training on a "fake" task. NEMRW also uses the negative sampling of word2vec to optimize the skip-gram model. By increasing the probability of visiting nodes outside the cluster, the multi-hop random walk increases the node information contained in the node sequences, so that the resulting vectors preserve more information on unbalanced networks and avoid the limitations of random walks on cluster-containing networks.

Finally, we use the proposed method to obtain low-dimensional vectors of the network on multiple data sets. We also obtain low-dimensional vectors on the same data sets with several other network embedding methods and apply all of these embeddings to a link prediction task and a node classification task. These two tasks are exactly where NEMRW is intended to be applied, so before the experimental analysis of each task, this thesis introduces the specific application scenarios of NEMRW. The thesis also reports experiments on the hyperparameter of the multi-hop random walk in NEMRW, discusses how the choice of this hyperparameter influences the results, and examines the convergence of NEMRW. The experimental results show that the proposed method improves performance on both the link prediction task and the node classification task, which demonstrates that it preserves more information about the original network when the network is unbalanced and is therefore more helpful for applying network data in machine learning tasks.
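
The abstract does not specify how a multi-hop step is taken, so the following is only a minimal sketch under stated assumptions, not the thesis's actual procedure. It assumes that each walk step traverses between 1 and `max_hops` edges (chosen uniformly) before the landing node is recorded, which makes it more likely that a step crosses a bridge between clusters. The names `build_adjacency`, `multi_hop_step`, `multi_hop_walk`, `skipgram_pairs`, and the parameters `max_hops`, `walk_length`, and `window` are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {node: sorted list of neighbours}."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def multi_hop_step(adj, node, max_hops, rng):
    """Traverse 1..max_hops edges from `node` and return the landing node.

    ASSUMPTION: the hop count is drawn uniformly; the thesis may use a
    different rule. The landing node can coincide with `node` itself
    (for example, after going out and back along the same edge).
    """
    hops = rng.randint(1, max_hops)
    current = node
    for _ in range(hops):
        current = rng.choice(adj[current])
    return current


def multi_hop_walk(adj, start, walk_length, max_hops, rng):
    """Generate one node sequence of `walk_length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < walk_length:
        walk.append(multi_hop_step(adj, walk[-1], max_hops, rng))
    return walk


def skipgram_pairs(walk, window):
    """List the (center, context) pairs a skip-gram model would train on."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Toy unbalanced network: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = build_adjacency(edges)
rng = random.Random(0)

# One walk per start node; max_hops=1 would reduce to a plain uniform walk.
walks = [multi_hop_walk(adj, node, walk_length=10, max_hops=3, rng=rng)
         for node in adj]
pairs = [p for walk in walks for p in skipgram_pairs(walk, window=2)]
```

With `max_hops=1` the sketch degenerates to a uniform random walk of the kind DeepWalk uses, which gives a simple baseline for comparison. The walks, converted to lists of string tokens, could then be passed to any skip-gram implementation with negative sampling, for example gensim's `Word2Vec` with `sg=1` and a positive `negative` value.
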
Keywords/Search Tags: Network Embedding, DeepWalk, Multi-hop Random Walk, Cluster