Research For Algorithm Of Chinese Entity Linking Technology Based On Topic Relation Graph

Posted on: 2018-07-06  Degree: Master  Type: Thesis
Country: China  Candidate: Y Chen
GTID: 2428330623950638  Subject: Computer Science and Technology
Abstract/Summary:
With the development of the Internet, the volume of online text has grown exponentially. The polysemy and ambiguity of words pose serious challenges to the accurate understanding of text. Entity linking addresses this challenge by mapping ambiguous entity mentions in text to the exact entries in a knowledge base. It is widely used in knowledge base extension, machine translation, question answering, information retrieval, and other fields, and is a promising research direction. In this thesis, we design TRGEL, a Chinese entity linking system based on topic consistency. We propose three algorithms for calculating the topic correlation between entities, and a new method of ranking and linking. The main research results and innovations of this dissertation are as follows:

(1) We designed TRGEL, a Chinese entity linking system based on topic consistency, which consists mainly of a candidate entity generation module, a TRG graph construction module, and a ranking module. We construct the entity relation graph TRG of the given text using the latent topic semantic information of co-occurring entities. Finally, by calculating the maximum-score topic subgraph in the TRG, we select the set of candidate entities that has the maximum relevance and content similarity to the topic of the given text as the entity linking targets, realizing batch linking.

(2) To address the insufficiency of text topic and entity semantic information, this thesis proposes a method for constructing the entity topic relation graph TRG based on feature models. We use TF-IDF keywords, LDA topic vectors, and Word2Vec (W2V) multidimensional word vectors to construct the weight sets of the TRG, which provide the basis for subsequent ranking and linking. In the experiments, we compare and analyze the three weighting algorithms. Among them, the Word2Vec multidimensional word vector algorithm performs better on both the news text and Weibo text test sets, which shows the stability of the algorithm.

(3) Linking entities one at a time is inefficient and does not take into account the association between co-occurring entities. Building on the TRG, we propose an inference linking method that uses the maximum-score topic subgraph. First, we calculate the topic consistency between candidate entities to form a score vector; then we perform batch linking using the score vector. In the experiments, compared with popular inference linking methods, this method is simpler and more accurate.
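
The batch-linking idea in (1) and (3) can be illustrated with a small sketch. This is not the thesis's TRGEL implementation: the function names `cosine` and `link_batch`, the toy two-dimensional vectors, and the use of summed pairwise cosine similarity as the subgraph score are all assumptions made for illustration. The exhaustive search over candidate combinations is only practical for a handful of mentions.

```python
from itertools import product
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def link_batch(candidates):
    """Pick one candidate entity per mention so that the summed pairwise
    similarity (a stand-in for the topic-subgraph score) is maximal.

    candidates: dict mapping mention -> dict mapping entity name -> vector.
    Returns (assignment dict mention -> entity name, best score).
    """
    mentions = list(candidates)
    best_score, best_assignment = float("-inf"), None
    # Enumerate every way of choosing one (entity, vector) pair per mention.
    for combo in product(*(candidates[m].items() for m in mentions)):
        score = sum(
            cosine(combo[i][1], combo[j][1])
            for i in range(len(combo))
            for j in range(i + 1, len(combo))
        )
        if score > best_score:
            best_score = score
            best_assignment = {m: ent for m, (ent, _) in zip(mentions, combo)}
    return best_assignment, best_score


# Hypothetical toy data: vectors stand in for TF-IDF, LDA, or Word2Vec features.
toy = {
    "Apple": {"Apple Inc.": [0.9, 0.1], "apple (fruit)": [0.1, 0.9]},
    "Jobs": {"Steve Jobs": [0.8, 0.2], "job (employment)": [0.5, 0.5]},
}
assignment, score = link_batch(toy)
print(assignment)
```

With this toy data, the two mentions are resolved together: the technology-related candidates reinforce each other, so "Apple" is linked to the company rather than the fruit. A real system would replace the exhaustive enumeration with a more scalable search over the graph.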
Keywords/Search Tags: Entity Linking, Entity Disambiguation, Topic Feature, Similarity Calculation, Word2Vec, LDA