
Embedding Model Based Knowledge Graph Completion

Posted on: 2018-12-27  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Z Wang  Full Text: PDF
GTID: 1318330536976259  Subject: Computer application technology
Abstract/Summary:
A knowledge graph is a collection of triples of the form (subject, predicate, object), where the subjects and objects are entities and the predicates are relations. Each triple, e.g., (Obama, Place-Of-Birth, Honolulu), represents a fact. When applied to Q&A systems, a knowledge graph can provide the desired answers only when it covers the corresponding facts. Although many large-scale open-domain knowledge graphs exist, they are still far from complete; e.g., in Freebase, 30% of person entities miss the triples about their "parents". Knowledge graph completion is the task of adding novel triples that represent real-world facts to an existing knowledge graph.

There are two main channels of information for knowledge graph completion:

1. Reasoning about novel triples from the existing triples of a knowledge graph.
2. Extracting novel entities and triples from text.

To exploit the first channel, there has been a surge of interest in knowledge graph embedding, which learns a dense vector for each entity and computes the plausibility of a triple from these vectors. The embedding models can be used to reason about the triples that an extractor harvests from text. As the two channels are complementary, the combination of an embedding model and an extractor drastically outperforms either alone.

We summarize the weaknesses of existing knowledge graph embedding models, and the challenge of combining an embedding model with an extractor, as follows:

1. The state-of-the-art knowledge graph embedding model, TransE, cannot handle reflexive, one-to-many, many-to-one, or many-to-many relations well.
2. In training a knowledge graph embedding model, existing negative sampling algorithms are likely to generate false negative examples.
3. For a triple extracted from text, its subject and object are words. If either the subject or the object cannot be linked to an entity of the considered knowledge graph, existing embedding models are unable to reason about the triple due to the lack of entity
vectors.

In this thesis, we propose a series of techniques to tackle the above issues. Our contributions are summarized as follows:

· We show that the first weakness stems from TransE modeling each relation as a translation operation on entity vectors. We therefore propose a novel embedding model, TransH, which solves the weakness by projecting entity vectors onto a relation-specific hyperplane before applying the translation operation. Meanwhile, TransH avoids increasing the model complexity too much.

· We propose a data-driven, relation-specific distribution for sampling negative examples when training a knowledge graph embedding model. The proposed distribution reduces the chance of sampling false negative examples, and its parameters can be determined directly from basic statistics of each relation.

· We first show that, in the word embedding model Word2Vec, the implicit relations between words can be interpreted as translation operations on word vectors, as in TransE. We therefore propose a joint embedding model that learns a dense vector for each entity and each word. Our joint embedding model can compute the plausibility of a triple that involves both words and entities. To the best of our knowledge, it is the first approach capable of handling such triples.

· We propose three alignment models based on entity linking, entity names, and entity descriptions, respectively. All the supervision for training the alignment models is easily available at large scale. Empirical evaluations show that the alignment models can effectively align the vector space of words with the vector space of entities.

We conduct extensive experiments comparing the proposed models with the baselines. Experimental results show that our approaches outperform the state-of-the-art methods, and detailed analyses confirm the motivations of the proposed models.
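The contrast between TransE's translation operation and TransH's hyperplane projection can be sketched in a few lines of NumPy. This is a minimal illustration of the scoring functions only, not the thesis implementation; the dimension and vector values are arbitrary:

```python
import numpy as np

def transe_score(h, r, t):
    # TransE plausibility: lower ||h + r - t|| means more plausible.
    return np.linalg.norm(h + r - t)

def transh_score(h, r, t, w):
    # TransH: project h and t onto the relation-specific hyperplane with
    # unit normal w, then apply the translation r within that hyperplane.
    w = w / np.linalg.norm(w)
    h_p = h - (h @ w) * w
    t_p = t - (t @ w) * w
    return np.linalg.norm(h_p + r - t_p)

rng = np.random.default_rng(0)
dim = 4
h, r, w = rng.normal(size=dim), rng.normal(size=dim), rng.normal(size=dim)
w_hat = w / np.linalg.norm(w)
r = r - (r @ w_hat) * w_hat  # keep the translation inside the hyperplane

# A one-to-many relation has several correct tails. TransE gives a tail a
# perfect score only when t = h + r, so all correct tails are pushed
# toward the same point:
t1 = h + r
assert transe_score(h, r, t1) == 0.0

# TransH scores two tails that differ along the normal direction w
# identically, so distinct tail entities need not collapse:
t2 = t1 + 2.5 * w_hat
assert np.isclose(transh_score(h, r, t1, w), transh_score(h, r, t2, w))
```

The projection gives each relation its own "view" of the entity space: an entity can occupy one position on the hyperplane of `Place-Of-Birth` and a different one on the hyperplane of `Parent-Of`, which is what lets TransH handle reflexive and many-valued relations without much extra model complexity.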
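The relation-specific negative sampling can also be sketched. The abstract only says the distribution's parameters come from basic per-relation statistics; the concrete choice below — corrupting the head with probability tph / (tph + hpt), where tph is the average number of tails per head and hpt the average number of heads per tail — is one common reading and should be treated as an assumption:

```python
import random
from collections import defaultdict

def head_corruption_prob(triples):
    """Per relation: p(corrupt head) = tph / (tph + hpt)."""
    tails, heads = defaultdict(set), defaultdict(set)
    for h, r, t in triples:
        tails[(r, h)].add(t)   # tails seen for each (relation, head)
        heads[(r, t)].add(h)   # heads seen for each (relation, tail)
    probs = {}
    for r in {r for _, r, _ in triples}:
        tph_vals = [len(v) for (rr, _), v in tails.items() if rr == r]
        hpt_vals = [len(v) for (rr, _), v in heads.items() if rr == r]
        tph = sum(tph_vals) / len(tph_vals)
        hpt = sum(hpt_vals) / len(hpt_vals)
        probs[r] = tph / (tph + hpt)
    return probs

def corrupt(triple, entities, p_head, rng=random):
    # Sample a negative example: replace the head with probability p_head,
    # otherwise replace the tail.
    h, r, t = triple
    if rng.random() < p_head:
        return (rng.choice([e for e in entities if e != h]), r, t)
    return (h, r, rng.choice([e for e in entities if e != t]))

triples = [("a", "parent_of", "b"), ("a", "parent_of", "c"), ("d", "parent_of", "e")]
probs = head_corruption_prob(triples)
# parent_of is one-to-many here (tph = 1.5, hpt = 1.0), so the head is
# corrupted more often: p_head = 1.5 / 2.5 = 0.6.
```

The intuition: for a one-to-many relation, replacing the tail of a true triple often yields another true triple (a false negative), while replacing the head rarely does, so the sampler biases corruption toward the head for such relations.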
Keywords/Search Tags: knowledge graph, embedding, relational fact extraction