
Research And Implementation Of English Entity Discovery And Linking System Based On Freebase

Posted on: 2020-08-20
Degree: Master
Type: Thesis
Country: China
Candidate: H Yang
GTID: 2428330575457048
Subject: Computer technology
Abstract/Summary:
With the continuous development of the Internet, people use it for many kinds of communication, and more and more unstructured text, such as news and encyclopedia articles, appears online. Processing and mining this information helps people better understand text content, grasp valuable information, and communicate more effectively. Named entity recognition and entity linking, the key technologies for processing entities in text, are receiving increasing attention from researchers at home and abroad. Named entity recognition identifies special terms that appear in text, such as person names, organization names, place names, and time expressions. Entity linking links each of these mentions to an unambiguous entity in a knowledge base. Recognized and linked entities greatly help in understanding text content and play an important role in information extraction, automatic question answering, machine translation, and other tasks. In recent years, large-scale knowledge bases have been increasingly applied in natural language processing, and knowledge graph technologies have developed rapidly. Named entity recognition and entity linking have developed along with them as core technologies for constructing and applying knowledge graphs.

The main problems faced by this task are the diversity and ambiguity of the entities in text: one entity name can refer to multiple existing entities, and one entity can have multiple names. To overcome the difficulties that this diversity and ambiguity bring to named entity recognition and entity linking, this thesis proposes a named entity recognition method based on a long short-term memory network and a conditional random field, and a neural network entity linking method based on the Freebase knowledge base. The linked entity is selected from a set of candidate entities, and the method combines the mention context with the entity description text to reduce the impact of entity ambiguity. The accuracy of this method on the AIDA CoNLL-YAGO corpus and the TAC KBP-2017 named entity recognition and entity linking evaluation corpus reached 88.2% and 83.7%, respectively. Our named entity recognition method reached 0.91 F1 on the CoNLL-2003 corpus. Visual analysis of the model parameters also verifies that the structured self-attention mechanism and memory network used by the model can extract key information from the mention context and the entity description text that benefits entity linking.

The main contributions of this thesis are as follows:
1. A named entity recognition model based on a bi-directional long short-term memory network and a conditional random field is proposed to identify mentions in text. The method uses the long short-term memory network to automatically discover effective features of the text, and uses the conditional random field to obtain the optimal label sequence.
2. An entity linking model based on the structured self-attention mechanism and a memory network is proposed. The model uses structured self-attention to extract the useful information in the mention context and the entity description text, and uses the memory network to capture the interaction between the mention and its context, and between the entity and its description text.
3. A named entity recognition and entity linking system is constructed that identifies mentions in text and links each of them to a specific entity in the Freebase knowledge base.
4. Experiments were conducted on two standard data sets, and the results show that the proposed method is comparable to current advanced neural network entity linking methods. Analysis of the parameter weights verifies the interpretability of the information the model draws from the mention context and the entity description text.
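The first contribution uses a conditional random field to pick the best label sequence on top of per-token scores. A minimal sketch of how such decoding is commonly done (Viterbi search) is given below. The abstract does not name the decoding algorithm, so Viterbi is an assumption here, and the function name `viterbi_decode`, the three-tag scheme, and all scores are hypothetical illustrations rather than the thesis's implementation.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence under a linear-chain model.

    emissions[t][j]   : score of tag j at token t (e.g. produced by a BiLSTM)
    transitions[i][j] : score of moving from tag i to tag j
    Both tables are assumed inputs; this sketch does not train them.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] = best score of any path that ends in tag j at the current token
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for j in range(num_tags):
            # previous tag that maximizes the score of a path entering tag j
            prev = max(range(num_tags),
                       key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # trace back from the best final tag to recover the full path
    last = max(range(num_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Hypothetical tag ids: 0 = O, 1 = B-PER, 2 = I-PER
TRANSITIONS = [
    [0.5, 0.0, -10.0],  # O -> I-PER is strongly penalized
    [0.0, -1.0, 1.0],   # B-PER -> I-PER is encouraged
    [0.0, -1.0, 0.5],
]
```

With the made-up scores above, decoding `[[1.0, 0.0, 0.0], [0.0, 0.0, 1.5]]` yields B-PER, I-PER instead of the per-token argmax O, I-PER, because the transition table penalizes an I-PER tag that directly follows O. This is the sense in which sequence-level decoding differs from labeling each token independently.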
Keywords/Search Tags: named entity recognition, entity linking, Freebase, structured self-attention mechanism, memory network