Entity Linking Algorithm Research And System Implementation Based On Wikipedia

Posted on: 2017-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: N Luo
GTID: 2308330485470216
Subject: Computer application technology
Abstract/Summary:
The Internet has entered an era of information explosion, and accurately retrieving the information users need from massive data is an urgent problem. Ambiguity, however, is widespread in natural language. Entity ambiguity is the phenomenon in which the same entity mention corresponds to different real-world entities in different contexts. Eliminating entity ambiguity helps in understanding text. Entity linking is the task of correctly linking person, place, and organization names in text to unambiguous entities in a knowledge base. It mainly addresses the disambiguation of synonymous and polysemous entity mentions, and it matters for information retrieval, question answering, and knowledge base completion. This thesis studies the core problem of entity linking, candidate entity ranking. Its main work and contributions are as follows.

1. Traditional candidate entity ranking often stops at the feature extraction stage: it extracts a large number of features and trains a model with supervised learning, which is complex. The features are also often shallow, such as string matching, and neglect the semantic similarity between entities. To address these problems, the thesis uses the link structure among entities in Wikipedia, on the observation that entities under the same topic are linked together and that semantically related entities are linked together as well. Based on this idea, it proposes two candidate entity ranking algorithms: one merges LDA with random walk with restart, and the other merges Word2Vec with PageRank. Both use the entity graph of Wikipedia. Random walk with restart produces a probability vector over entities, whereas PageRank produces a PR value for each entity. The former incorporates the topic feature vector of each entity; the latter incorporates the semantic similarity between entities. Both thus merge semantic features on top of the graph structure, and experiments show that they improve entity linking accuracy.

2. Combining the two candidate entity ranking algorithms, the thesis develops an entity linking system, LEL, which links entities in text to the knowledge base and is highly interactive.
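
To make the first ranking idea concrete, here is a minimal sketch of a random walk with restart over a small entity link graph. The abstract does not say how the LDA topic vectors enter the walk, so this sketch assumes they shape the restart distribution: each entity's restart weight is the cosine similarity between its topic vector and the topic vector of the mention's context. The graph, the topic vectors, and the names `LINKS`, `TOPICS`, `CONTEXT`, `topic_restart`, and `random_walk_with_restart` are all hypothetical and are not taken from the thesis.

```python
import math

# Hypothetical toy link graph: entity -> entities it links to.
LINKS = {
    "Michael_Jordan": ["NBA", "Chicago_Bulls"],
    "Michael_I_Jordan": ["Machine_learning"],
    "NBA": ["Michael_Jordan", "Chicago_Bulls"],
    "Chicago_Bulls": ["Michael_Jordan", "NBA"],
    "Machine_learning": ["Michael_I_Jordan"],
}

# Hypothetical topic vectors over two topics (sports, science),
# standing in for what an LDA model would infer.
TOPICS = {
    "Michael_Jordan": [0.9, 0.1],
    "Michael_I_Jordan": [0.1, 0.9],
    "NBA": [0.95, 0.05],
    "Chicago_Bulls": [0.9, 0.1],
    "Machine_learning": [0.05, 0.95],
}

# Hypothetical topic vector of the text surrounding the mention "Jordan".
CONTEXT = [0.9, 0.1]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def topic_restart(topics, context):
    """Assumed coupling: restart weight proportional to topic similarity."""
    sims = {node: cosine(vec, context) for node, vec in topics.items()}
    total = sum(sims.values())
    return {node: s / total for node, s in sims.items()}


def random_walk_with_restart(links, restart, alpha=0.15, tol=1e-10, max_iter=1000):
    """Iterate p = alpha * restart + (1 - alpha) * (one walk step from p)."""
    nodes = list(links)
    p = {n: restart.get(n, 0.0) for n in nodes}
    for _ in range(max_iter):
        nxt = {n: alpha * restart.get(n, 0.0) for n in nodes}
        for u in nodes:
            out = links[u]
            if out:
                share = (1 - alpha) * p[u] / len(out)
                for v in out:
                    nxt[v] += share
            else:
                # Dangling node: return its mass to the restart distribution.
                for n in nodes:
                    nxt[n] += (1 - alpha) * p[u] * restart.get(n, 0.0)
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return p


scores = random_walk_with_restart(LINKS, topic_restart(TOPICS, CONTEXT))
candidates = ["Michael_Jordan", "Michael_I_Jordan"]
ranked = sorted(candidates, key=scores.get, reverse=True)
print(ranked)
```

In this toy setting the sports-heavy context pushes most of the restart mass onto the basketball-related entities, so `Michael_Jordan` ranks above `Michael_I_Jordan`. The second algorithm could be sketched in the same style by replacing the uniform edge shares with weights derived from embedding similarity, though the abstract does not give that detail either.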
Keywords/Search Tags: entity linking, entity disambiguation, knowledge base