Research On Entity Linking For Domain Text Resources

Posted on: 2016-09-13
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Xue
Full Text: PDF
GTID: 2308330461450807
Subject: Software engineering
Abstract/Summary:
In the information age, the Internet has become the main channel through which people obtain information. In recent years, the large collaborative resource Wikipedia and the knowledge bases built on it, such as DBpedia, Freebase and YAGO2, have encouraged new research tasks based on Wikipedia knowledge, including document classification and clustering, semantic discovery, and entity linking. Entity linking is the process of linking ambiguous entity mentions in natural language text to a set of known target entities in a knowledge base. Ordinary users often cannot understand the many professional terms that appear in teaching resources, which makes reading and learning difficult. Entity linking maps each name in free text to the most suitable entity in the knowledge base, which helps beginners look up the definitions of academic terms more quickly and conveniently and improves the reading experience. Entity linking for domain text is therefore of considerable significance.

This thesis first describes the research background and significance of entity linking and the state of research in China and abroad, introduces several typical entity linking algorithms, and analyses their characteristics and shortcomings. It also illustrates problems in current entity linking algorithms: efficient methods use relatively simple context features to avoid computational complexity, while accurate methods use more context features and are much slower.

The thesis then proposes an entity linking method, TSELG (Two Stages of the Entity Linking on Graph). In the first stage, the easy annotations, that is, the clearer ones, are linked, and the domains of the text are determined from the linked entities. In the second stage, the determined entities and domains are used to link the remaining annotations. To guarantee the accuracy of the algorithm, Wikipedia information is extracted for the similarity calculation used in entity linking.

Finally, to validate the proposed TSELG algorithm, it is compared with two existing systems on real data. Experiments show that the method achieves higher linking accuracy and efficiency than the classical algorithm.
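The two-stage idea summarized in the abstract can be illustrated with a small sketch. It is not the thesis's TSELG implementation: the abstract does not specify the graph model or the Wikipedia-based similarity, so this version assumes that an "easy" annotation is one with a single candidate entity and replaces the similarity calculation with a simple word-overlap score. The knowledge base `KB`, the table `CANDIDATES`, and the function `two_stage_link` are hypothetical names with toy data, introduced only for illustration.

```python
from collections import Counter

# Hypothetical toy knowledge base: entity -> (domain, typical context words).
KB = {
    "Java (programming language)": ("computing", {"code", "class", "compiler"}),
    "Java (island)": ("geography", {"indonesia", "island", "volcano"}),
    "Python (programming language)": ("computing", {"code", "interpreter", "script"}),
    "Python (snake)": ("biology", {"snake", "reptile", "species"}),
    "Compiler": ("computing", {"code", "translate", "program"}),
}

# Hypothetical candidate table: mention -> candidate entities.
CANDIDATES = {
    "compiler": ["Compiler"],
    "Java": ["Java (programming language)", "Java (island)"],
    "Python": ["Python (programming language)", "Python (snake)"],
}


def two_stage_link(mentions, context_words):
    """Link mentions in two passes (illustrative assumption, not TSELG itself)."""
    linked = {}

    # Stage 1: link the "easy" mentions. Assumption: easy = exactly one candidate.
    for mention in mentions:
        candidates = CANDIDATES.get(mention, [])
        if len(candidates) == 1:
            linked[mention] = candidates[0]

    # Determine the text's domain from the entities linked so far.
    domain_counts = Counter(KB[entity][0] for entity in linked.values())
    domains = {domain for domain, _ in domain_counts.most_common(1)}

    # Stage 2: link the remaining mentions using the domain and the context.
    def score(entity):
        domain, words = KB[entity]
        # Stand-in for the similarity calculation: domain match plus word overlap.
        return int(domain in domains) + len(words & context_words)

    for mention in mentions:
        candidates = CANDIDATES.get(mention, [])
        if mention in linked or not candidates:
            continue
        linked[mention] = max(candidates, key=score)

    return linked


result = two_stage_link(["compiler", "Java", "Python"], {"code"})
```

In this toy run the unambiguous mention "compiler" fixes the domain as computing in the first stage, and that domain then steers the ambiguous mentions "Java" and "Python" toward the programming-language entities in the second stage.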
Keywords/Search Tags: Entity linking, Semantic similarity, Wikipedia, Knowledge Base