Entity Association Mining Based On Heterogeneous Graph

Posted on: 2019-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: D Q Kong
Full Text: PDF
GTID: 2348330542991039
Subject: Signal and Information Processing
Abstract/Summary:
With the development of information technology and the spread of smart devices, multi-modal heterogeneous data such as digital images, text, audio, and speech has become the main form of information and is changing the way people live and work. Mining and analyzing the associations among heterogeneous data has become an important research area in cross-media analysis and data mining. Targeting heterogeneous graph data, this thesis studies three key problems in depth: recommender systems, cross-domain link prediction, and image-document matching. The main contributions are threefold.

1. By jointly learning an auto-encoder (AE) and a matrix factorization (MF) model, we propose a graph embedding based matrix factorization model (GEMFM) for the user-item heterogeneous graph. The hidden layer of the AE is used to optimize the latent factors of the original MF model, which improves the model's nonlinear representation ability. Experimental results verify the effectiveness of the GEMFM algorithm.

2. This thesis proposes a multi-stage cross-domain link prediction model (MCLPM) to solve the link prediction problem. In the recall model, a subset of candidates is generated by nearest neighbor search, based on the statistical properties of the time series. In the matching model, an active learning based negative sampling algorithm is proposed to learn from hard samples. Experimental results show the effectiveness of the proposed algorithm in cross-domain link prediction.

3. This thesis proposes a deep image-document matching model (DIDM) that exploits the heterogeneous relation between images and documents. DIDM learns image and document representations with a CNN and Word2vec, respectively. To establish an exact matching between image and document representations in the isomorphic semantic space, a triplet loss is introduced to achieve cross-media image-document matching. Experimental results on large-scale datasets demonstrate the effectiveness of the proposed method.
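
The abstract does not give the exact form of the triplet loss used in DIDM, so the following is only a minimal sketch of the standard margin-based triplet loss, assuming an image embedding as the anchor, a matching document embedding as the positive, and a non-matching document embedding as the negative. The function names, the L2 normalization step, the squared Euclidean distance, and the margin value of 0.2 are all illustrative assumptions, not details taken from the thesis.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumed preprocessing step)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard margin-based triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin` (a hypothetical value here).
    """
    a, p, n = (l2_normalize(v) for v in (anchor, positive, negative))
    return max(0.0, squared_distance(a, p) - squared_distance(a, n) + margin)


# Toy 2-D embeddings standing in for CNN image features and Word2vec
# document features already projected into a shared space.
image_vec = [1.0, 0.0]
matching_doc = [0.9, 0.1]
unrelated_doc = [0.0, 1.0]

well_separated = triplet_loss(image_vec, matching_doc, unrelated_doc)
violating = triplet_loss(image_vec, unrelated_doc, matching_doc)
```

In this toy setup the first triplet already satisfies the margin and contributes no loss, while the swapped triplet violates it and yields a positive loss, which is the signal that would pull matching image-document pairs together during training.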
Keywords/Search Tags: heterogeneous graph, link prediction, joint learning, image-document matching, deep learning