Research On Degree-biased Sampling Algorithm For Large-scale Network Representation Learning

Posted on:2020-12-09

Degree:Master

Type:Thesis

Country:China

Candidate:Y Y Zhang

GTID:2370330590458335

Subject:Computer system architecture

Abstract/Summary:

Network representation learning aims to represent the nodes of a network as low-dimensional, dense, real-valued vectors that serve as features for classical network analysis tasks such as classification, prediction, and visualization. Traditional network representation learning methods use matrix decomposition for dimension reduction to obtain node representations. Because they lack scalability and generality, they have gradually been replaced by methods based on deep learning, which usually sample node sequences with random walks and train node vectors with a neural network. However, these methods ignore the scale-free characteristic of real networks and apply a "one size fits all" sampling strategy to every node. This introduces a large amount of redundant information into the generated node sequences and fails to preserve the structure of the original network well, which greatly limits the effectiveness and efficiency of network representation learning. To address this, a degree-biased, variable-length random walk with backtracking, DiaRW, is proposed. A degree-biased backtracking mechanism is added to the uniform random walk: walks from high-degree nodes backtrack probabilistically, so that topological structure can be extracted more fully by exploiting the central role of high-degree nodes. Meanwhile, a centrality-based variable-length strategy replaces the fixed-length one, aiming to reduce the redundant information collected by low-degree nodes. By focusing on the scale-free characteristic of real networks, DiaRW makes the extraction of structural information more efficient and accurate, improving both the effectiveness and the efficiency of network representation learning. The experimental results show that DiaRW greatly improves the efficiency of network representation learning while preserving the quality of the node vectors. For a network with millions of nodes (YouTube), it takes only 58 minutes to finish representation learning for all nodes, which is ten times faster than node2vec. Moreover, the learned vectors achieve improvements of 8.1% and 9.6% in Micro-F1 and Macro-F1 on the multi-label node classification task.
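
The abstract names the two ingredients of DiaRW (probabilistic backtracking for walks from high-degree nodes, and a centrality-based walk length) but gives no formulas. The Python sketch below is therefore only an illustration of the general idea, not the thesis's algorithm. Everything specific in it is an assumption: the helper names (`build_adjacency`, `walk_length`, `backtrack_prob`, `degree_biased_walk`), the linear scaling of walk length and backtrack probability with normalized degree, the bounds `min_len`, `max_len`, and `max_prob`, and the reading of "backtrack" as "jump back to the start node".

```python
import random
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return dict(adj)


def walk_length(degree, max_degree, min_len=5, max_len=40):
    """ASSUMPTION: walk length grows linearly with normalized degree,
    so low-degree start nodes produce shorter walks."""
    frac = degree / max_degree
    return min_len + round(frac * (max_len - min_len))


def backtrack_prob(degree, max_degree, max_prob=0.5):
    """ASSUMPTION: backtrack probability grows linearly with normalized
    degree, so walks from hubs return to the hub more often."""
    return max_prob * degree / max_degree


def degree_biased_walk(adj, start, max_degree, rng):
    """One variable-length walk from `start` with probabilistic backtracking.

    ASSUMPTION: 'backtracking' is modeled as jumping back to the start node;
    the source does not define the exact mechanism.
    """
    start_degree = len(adj[start])
    length = walk_length(start_degree, max_degree)
    p_back = backtrack_prob(start_degree, max_degree)

    walk = [start]
    current = start
    while len(walk) < length:
        if current != start and rng.random() < p_back:
            current = start                    # backtrack to the start node
        else:
            current = rng.choice(adj[current])  # uniform step to a neighbor
        walk.append(current)
    return walk


if __name__ == "__main__":
    # Toy star graph: node 0 is the hub, nodes 1..5 are leaves.
    edges = [(0, i) for i in range(1, 6)]
    adj = build_adjacency(edges)
    max_degree = max(len(neighbors) for neighbors in adj.values())
    rng = random.Random(0)
    for node in (0, 1):
        walk = degree_biased_walk(adj, node, max_degree, rng)
        print(node, len(walk), walk)
```

In this sketch the hub yields a longer walk than a leaf, which mirrors the abstract's stated goal of collecting less redundant information from low-degree nodes. The resulting node sequences would then be passed to a skip-gram style model to train the node vectors, as in other random-walk-based methods.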

Keywords/Search Tags:

Network Representation Learning, Scale-free Network, Random Walk

Related items

1	Research On Network Representation Learning Algorithm Based On Random Walk
2	Research On Network Representation Learning Algorithm Based On Connectivity Perception And Deviation Feedback
3	Research And Implementation Of Network Representation Learning Algorithm Based On Time Sequence Information
4	Multi-granularity Complex Network Representation Learning Based On Random Walk
5	Research On Attributed Network Representation Learning Methods
6	Research On Network Representation Learning Method For Non-attribute Graphs
7	Research On Network Representation Learning Method Based On Edges And Attributes
8	Research On Link Prediction Algorithms Based On Topology Structure And Network Representation Learning
9	Representation Learning For Large-scale Attributed Networks
10	Modeling Of Scale-Free Networks Through Random Walk