
Domain Entity Disambiguation And Link Prediction Based On Representation Learning

Posted on: 2019-03-02 | Degree: Master | Type: Thesis
Country: China | Candidate: X J Ma | Full Text: PDF
GTID: 2438330563457662 | Subject: Computer technology
Abstract/Summary:
In natural language processing, symbols often have to be converted into numerical form. The common method is one-hot representation: each word is represented as a very long vector with a value of 1 in one dimension and 0 in all others, where the vector length equals the size of the vocabulary. This representation suffers from the curse of dimensionality, and it treats words as isolated symbols with no semantic connection between them. Knowledge bases built by people are usually expressed as networks, in which nodes represent entities and edges represent the relations between entities. Under a network representation, special graph algorithms usually have to be designed to store and use the knowledge, which is time-consuming and laborious and is plagued by data sparseness.

Representation learning addresses these problems well. Its idea is to represent semantic information as dense, low-dimensional, real-valued vectors. Word representation learning represents words as word vectors. Knowledge representation learning applies representation learning to the triples in a knowledge base, representing the entities and relations in the triples as vectors. The technique computes semantic relationships between entities efficiently and has a unique advantage in knowledge reasoning and knowledge acquisition. This dissertation focuses on entity disambiguation and knowledge graph completion in the construction of a domain-specific knowledge graph, and covers the following three areas.

1. Acquisition of domain hyponymy relations by combining multiple strategies and word representations. The hypernym-hyponym relations between domain entities form the skeleton of the knowledge graph and determine its depth. This thesis proposes a method that captures hyponymy relations in the domain by combining multiple strategies with word representations. First, candidate hyponymy entity pairs are extracted from semi-structured and unstructured texts, and an SVM is used to verify the candidate pairs. Word vectors are then trained to represent the candidate entities as vectors, and the vector offsets (difference vectors) of the entity pairs are clustered to group semantically similar hyponymy entity relations.

2. Domain entity disambiguation. This thesis proposes a domain entity disambiguation method that fuses word vectors with an LDA topic model. Word vectors are used to obtain vector forms of the mention and of the candidate entities from the contextual text and the knowledge base respectively, and these are used to compute context similarity and category reference similarity. The LDA topic model and the Skip-gram word vector model are used to obtain word vector representations of the different senses of polysemous words, and domain topic keywords are extracted to compute domain topic similarity. Finally, the three types of features are fused, and the candidate with the highest similarity is taken as the final target entity.

3. Entity link prediction based on representation learning models. This thesis proposes entity link prediction based on the TransE translation model and the TransR model. The TransE and TransR models are trained with knowledge representation learning, mapping the knowledge graph into a low-dimensional vector space to represent the triples in a tourism-domain knowledge graph. The learned representations are then used to predict links between domain entities, in order to verify the effectiveness of the two models for domain entity link prediction.
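To make the link prediction step concrete, the following is a minimal sketch of how TransE and TransR score a candidate triple (h, r, t), using the standard distance ||h + r - t|| for TransE and ||M_r h + r - M_r t|| for TransR. It is not the thesis's implementation: the entity names, the 2-dimensional embeddings, the relation vector and the projection matrix are hypothetical, hand-set values chosen for illustration, whereas a real model would learn them from the knowledge graph triples.

```python
import math


def l2_norm(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def transe_score(h, r, t):
    """TransE distance ||h + r - t||; a lower score means a more plausible triple."""
    return l2_norm([hi + ri - ti for hi, ri, ti in zip(h, r, t)])


def project(matrix, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]


def transr_score(h, r, t, m_r):
    """TransR distance ||M_r h + r - M_r t||, computed in the relation space."""
    return transe_score(project(m_r, h), r, project(m_r, t))


def rank_tails(h, r, candidates, score_fn):
    """Return candidate tail names ordered from most to least plausible."""
    return sorted(candidates, key=lambda name: score_fn(h, r, candidates[name]))


# Hypothetical hand-set embeddings (assumption: a trained model would learn these).
entities = {
    "West Lake": [0.0, 0.0],
    "Hangzhou": [1.0, 0.0],
    "Kunming": [0.0, 1.0],
}
located_in = [1.0, 0.0]               # hypothetical relation vector
identity = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical TransR projection matrix M_r

# Link prediction: which tail entity best completes (West Lake, located_in, ?)
candidates = {name: vec for name, vec in entities.items() if name != "West Lake"}
ranking = rank_tails(entities["West Lake"], located_in, candidates, transe_score)
print(ranking)
```

With an identity projection matrix, the TransR score reduces to the TransE score. The two models differ only when M_r maps entities into a separate relation-specific space.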
Keywords/Search Tags: knowledge graph, relation extraction, word embedding, entity disambiguation, LDA topic model, representation learning