
Research On Graph Analysis Oriented Representation Learning Technologies

Posted on: 2019-06-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y Fang
Full Text: PDF
GTID: 2428330611493630
Subject: Management Science and Engineering
Abstract/Summary:
Graphs are an important data representation that appears in a wide variety of real-world scenarios, e.g., social graphs in social networks, citation graphs in research areas, and knowledge graphs. Effective graph analysis gives users a deeper understanding of what lies behind the data, and thus benefits many applications such as recommendation systems, natural language processing, and visualization. However, most graph analysis methods suffer from high computation and space costs. Graph representation learning, i.e., graph embedding, is an effective and efficient way to address the graph analysis problem. It converts graph data into a low-dimensional space in which the structural information and properties of the graph are maximally preserved. Our research focuses on two main techniques of graph embedding: network embedding and knowledge embedding. Existing graph embedding methods tend to suffer from low computational efficiency and data sparsity. In particular, existing network embedding models are also unable to handle heterogeneous information networks (HINs), which are more common in real-world scenarios than homogeneous information networks. Specifically, we propose four graph embedding models to address these issues: one for homogeneous networks, two for heterogeneous networks, and one for knowledge graphs.

We first introduce a homogeneous network embedding model named BimoNet, which consists of two parts: a bi-mode embedding part and a deep neural network part. In the bi-mode embedding part, the add-mode and the subtract-mode express the entity-shared and the entity-specific features of edges, respectively. In the deep neural network part, a deep autoencoder preserves the structural information of the edges. By jointly optimizing the objective functions of these two parts, BimoNet preserves both the semantic and the structural information of edges. In experiments, the dataset we adopt is a homogeneous information network: an author network connected by shared research interests, in which the only node type is author. We evaluate BimoNet on the benchmark task of relation extraction.

Nevertheless, in real-world graph data, heterogeneous information networks are more common than homogeneous ones. Hence we propose a novel heterogeneous network embedding model, TransPath, which combines the translation mechanism with meta-paths. It regards a meta-path as a translating operation from the first node to the last node. Moreover, we propose a user-guided meta-path sampling strategy that takes users' preferences as guidance. This strategy explores the semantics of a path more precisely and improves model efficiency by avoiding noisy and meaningless meta-paths. We evaluate our model on two large-scale real-world datasets, DBLP and YELP, and on two benchmark tasks, similarity search and node classification.

The representation ability of meta-paths is still limited, as there is an apparent information loss when paths are used to represent the neighborhood structure between two nodes. Hence we offer a novel mechanism that captures the features of nodes via metagraphs, which retain more semantic and structural information than paths. We propose to construct HIN triplets from nodes and the metagraphs between them. A Hadamard function is then applied to encode the relationships between nodes and metagraphs, so that the probability that a HIN triplet is positive can be evaluated. Further, to better distinguish the symmetric and asymmetric cases of metagraphs, we introduce a complex-embedding scheme, which is able to express HIN nodes precisely. We evaluate the proposed model, named metagraph2vec, on real-life datasets.

We also propose a new knowledge embedding model named Bi-Mult, which uses a dynamic bi-mode embedding mechanism to represent the knowledge graph. It combines the advantages of compositional models and translation models. In the bi-mode embedding, an entity (resp. relation) embedding is decomposed into two parts: one represents the intra-entity (resp. intra-relation) state and the other represents the inter-entity (resp. inter-relation) state. In addition, the bi-mode relation embedding enhances the interaction of a relation with entities, which improves the handling of antisymmetric relations. Moreover, we incorporate mapping matrices into translation models through the bi-mode entity embedding to construct dynamic embeddings for expressing complex relations. In experiments, we evaluate our method on benchmark datasets and on the task of link prediction.

In conclusion, graph embedding is proposed to analyze the rich information behind graph data. However, traditional graph embedding models suffer from low computational efficiency and data sparsity. To address these issues, we propose four new graph embedding methods, covering both network embedding and knowledge embedding, each of which provides a new angle on the representation learning of a graph. In experiments, all of our models are shown to outperform the previous baseline models.
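Two of the scoring ideas mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the exact scoring functions, so everything below is an assumption: the translation score uses a TransE-style L2 distance for "a meta-path as a translating operation from the first node to the last node", and the complex-embedding score uses a ComplEx-style sum over the Hadamard product, passed through a sigmoid to obtain a probability. All function names and toy vectors are hypothetical and are not taken from the thesis.

```python
import math


def translation_score(first, path, last):
    """Assumed TransE-style distance ||first + path - last||; lower is a better fit."""
    return math.sqrt(sum((f + p - l) ** 2 for f, p, l in zip(first, path, last)))


def complex_hadamard_score(head, graph, tail):
    """Assumed ComplEx-style score: Re(sum_k head_k * graph_k * conj(tail_k))."""
    return sum(h * g * t.conjugate() for h, g, t in zip(head, graph, tail)).real


def triplet_probability(head, graph, tail):
    """Sigmoid of the score, read as the probability that the triplet is positive."""
    return 1.0 / (1.0 + math.exp(-complex_hadamard_score(head, graph, tail)))


# Toy, hand-picked vectors (not learned embeddings).
author = [0.25, 0.5]
venue = [1.0, 0.0]
author_to_venue = [0.75, -0.5]  # hypothetical meta-path embedding

forward = translation_score(author, author_to_venue, venue)
backward = translation_score(venue, author_to_venue, author)

a = [1 + 2j, 0.5 - 1j]
b = [2 - 1j, 1 + 1j]
symmetric_graph = [1 + 0j, 2 + 0j]  # purely real: score ignores argument order
asymmetric_graph = [1j, 1j]         # purely imaginary: score flips sign when swapped

sym_ab = complex_hadamard_score(a, symmetric_graph, b)
sym_ba = complex_hadamard_score(b, symmetric_graph, a)
asym_ab = complex_hadamard_score(a, asymmetric_graph, b)
asym_ba = complex_hadamard_score(b, asymmetric_graph, a)
```

In this toy setting the meta-path vector translates `author` onto `venue` but not the reverse, and the imaginary part of the metagraph embedding is what lets the complex score separate a triplet from its reversed counterpart, which a purely real embedding cannot do.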
Keywords/Search Tags:graph analysis, representation learning, graph embedding, network embedding, knowledge graph, knowledge embedding