The Research Of Relational Learning In Heterogeneous Information Networks

Posted on:2018-08-10

Degree:Master

Type:Thesis

Country:China

Candidate:Q Gu

Full Text:PDF

GTID:2310330518495399

Subject:Information and Communication Engineering

Abstract/Summary:

In recent years, the boom of heterogeneous information networks (HIN), especially the emergence and development of knowledge graphs, has accelerated research on related techniques. A number of data mining tasks have been explored in these networks. Among them, link prediction, which aims at predicting the links between entities, is one of the most important, and it is the foundation for solving many other problems in HIN. Relational inference refers to inferring the latent relations in the network by analyzing the complex structure and the varied semantics of the heterogeneous information network, and it guides the solution of link prediction tasks.

This thesis first studies the basic similarity measure that underlies relational inference. It proposes RSSim, a random path sampling algorithm based on Monte Carlo simulation, to address the time and memory costs of traditional methods based on matrix chain multiplication, such as PCRW and HeteSim. The thesis also gives a theoretical proof for the required number of random walkers. Experiments show that a small number of walkers is enough to guarantee the accuracy of the similarity ranking, and an empirical formula for the similarity error is given.

The mainstream method based on path features is the Path Ranking Algorithm (PRA), which completes the link prediction task in two steps. The first step runs a traversal algorithm on the graph to find all meta paths to use as features. The second step trains a relational classification model using a meta-path-constrained random walk algorithm. Building on RSSim, this thesis proposes a novel relational inference method, the subgraph path extraction algorithm. It integrates the feature selection and feature calculation steps of PRA, building features by searching and merging the subgraphs of entities, which greatly reduces the time cost of the process.

To meet the requirements of relational inference on large-scale knowledge graphs, this thesis also presents a distributed version of the subgraph path extraction algorithm. It consists of two steps: distributed subgraph path feature computation and distributed multi-model training. The parallel algorithm solves the low efficiency of training models on a single machine. In the distributed system, the models, which are divided according to relation, are trained simultaneously, which greatly improves efficiency.
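
The sampling idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's RSSim algorithm, whose details are not given here. It only contrasts an exact path-constrained random walk (propagating probabilities step by step, the analogue of matrix chain multiplication) with a Monte Carlo estimate obtained from a fixed number of walkers. The toy graph, the relation names `writes` and `published_in`, and the function names are all hypothetical.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy network: adjacency lists keyed by relation name.
EDGES = {
    "writes": {"a1": ["p1", "p2", "p3"], "a2": ["p3", "p4"]},
    "published_in": {"p1": ["v1"], "p2": ["v1"], "p3": ["v2"], "p4": ["v2"]},
}


def exact_pcrw(source, meta_path, edges=EDGES):
    """Exact path-constrained random walk distribution.

    Probability mass is propagated one relation at a time, which is
    the sparse equivalent of multiplying transition matrices in a chain.
    """
    dist = {source: 1.0}
    for relation in meta_path:
        nxt = defaultdict(float)
        for node, prob in dist.items():
            neighbors = edges[relation].get(node, [])
            for nb in neighbors:
                nxt[nb] += prob / len(neighbors)
        dist = dict(nxt)
    return dist


def sampled_pcrw(source, meta_path, num_walkers, rng, edges=EDGES):
    """Monte Carlo estimate: the fraction of walkers ending at each node."""
    hits = Counter()
    for _ in range(num_walkers):
        node = source
        for relation in meta_path:
            neighbors = edges[relation].get(node, [])
            if not neighbors:  # dead end: this walker is discarded
                node = None
                break
            node = rng.choice(neighbors)
        if node is not None:
            hits[node] += 1
    return {n: c / num_walkers for n, c in hits.items()}


if __name__ == "__main__":
    path = ["writes", "published_in"]
    print("exact:  ", exact_pcrw("a1", path))
    print("sampled:", sampled_pcrw("a1", path, 20000, random.Random(0)))
```

In this sketch the sampler never materializes a full probability vector or matrix. Its memory use depends on the number of walkers rather than on the size of the network, which is the kind of trade-off the abstract attributes to sampling-based similarity.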

Keywords/Search Tags:

heterogeneous information networks, similarity measure, relational learning, link prediction, random walk

Related items

1	Research On Link Prediction Algorithm For Complex Networks Based On Similarity
2	Random Walk On Multi-relational Heterogeneous Networks
3	Research On Pervasive Link Prediction Model For Multiple Types In Heterogeneous Academic Networks
4	Alignment Of Multiple Literature Networks And Link Prediction Across Networks
5	Research On Link Prediction In Multi-Relational Networks
6	Research On Link Prediction Method Based On PU Learning
7	Similarity Index-based Learning Link Prediction In Complex Networks
8	A Study Of Link Prediction In Complex Networks Based On Local Structural Information
9	Link Prediction Method For Opportunistic Networks Based On Random Walk And Deep Learning
10	Link Prediction In Complex Dynamic Networks