
Research And Application Of Entity Link System

Posted on: 2021-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: F Liu
GTID: 2438330620964111
Subject: Engineering
Abstract/Summary:
In the era of big data, extracting and understanding the knowledge contained in text has become increasingly important. Chinese vocabulary is diverse and ambiguous, so correctly understanding text and obtaining accurate language information is one of the research hotspots in the field of NLP. Entity linking is the process of identifying entity mentions in a given text and linking them to the corresponding entities in a target knowledge base. It mainly addresses the ambiguity and diversity of Chinese words, and improving its accuracy enhances text understanding. Building on an analysis of existing entity linking technologies, this thesis studies candidate entity generation and entity disambiguation in depth. The main research contents are as follows:

(1) This thesis constructs an entity mention expansion algorithm. Traditional entity linking systems search the knowledge base directly with the entity mention as it appears in the text, which results in a low recall rate. We therefore propose an entity mention expansion algorithm that obtains the most accurate Chinese representation of each mention, which improves the recall rate of candidate entities.

(2) Expanding entity mentions produces too many candidate entities and slows down the overall entity linking system. We therefore build a graph-model filtering algorithm that integrates shallow semantic information. Traditional algorithms filter candidate entities with a single feature, which lowers the candidate recall rate. The candidate entity filtering algorithm proposed in this thesis minimizes the size of the candidate set and improves system efficiency while keeping the candidate recall rate at a high level.

(3) This thesis constructs a disambiguation model based on a fine-grained entity classification model. Traditional models predict the disambiguated entity directly, which causes the model to overfit, so we convert the entity disambiguation problem into an entity classification problem. We also improve the modeling method, using an attention mechanism with embedded position information to obtain a deep semantic representation of the entity mention. The fine-grained category of each candidate entity is also used for disambiguation. Experiments show that the proposed disambiguation algorithm significantly improves accuracy.

(4) To address the lack of Chinese corpora, we first preprocess the Chinese Wikipedia data to build the data foundation for the subsequent entity linking algorithms. We also build a display and application system for the entity linking algorithm, using a top-down design method. In addition to showing the details of the entity linking algorithm, the system shows the related algorithm flow of Chinese Q&A, which demonstrates the practicability and scalability of the algorithms in this thesis.
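The three algorithmic stages described in (1) to (3) can be pictured with a minimal sketch. This is not the thesis's implementation: the knowledge base, alias table, and all function names below are hypothetical, the word-overlap filter is a simple stand-in for the graph-model filter, and the fine-grained type is passed in by hand where the thesis would use the attention-based classifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    name: str
    fine_type: str     # fine-grained category, e.g. "company" or "fruit"
    description: str


# Toy knowledge base and alias table (hypothetical data, for illustration only).
KB = {
    "Apple Inc.": Entity("Apple Inc.", "company",
                         "technology company that makes phones and computers"),
    "Apple (fruit)": Entity("Apple (fruit)", "fruit",
                            "edible fruit that grows on a tree"),
    "Apple Records": Entity("Apple Records", "company",
                            "record label founded by the beatles"),
}
ALIASES = {"apple": ["Apple Inc.", "Apple (fruit)", "Apple Records"]}


def expand_mention(mention):
    """Stage 1 (assumed form): expand a mention into extra surface forms."""
    stripped = mention.strip()
    return {stripped, stripped.lower()}


def generate_candidates(mention):
    """Look up every expanded form in the alias table and the KB itself."""
    names = set()
    for form in expand_mention(mention):
        names.update(ALIASES.get(form, []))
        if form in KB:
            names.add(form)
    return sorted(names)


def filter_candidates(candidates, context, keep=2):
    """Stage 2 stand-in: rank by context/description word overlap, keep the top few."""
    context_words = set(context.lower().split())

    def overlap(name):
        return len(context_words & set(KB[name].description.lower().split()))

    ranked = sorted(candidates, key=lambda name: (-overlap(name), name))
    return ranked[:keep]


def disambiguate(candidates, predicted_type):
    """Stage 3 stand-in: pick the first candidate whose fine-grained type matches."""
    for name in candidates:
        if KB[name].fine_type == predicted_type:
            return name
    return None


context = "she bought a phone from the technology company"
candidates = generate_candidates("Apple")
shortlist = filter_candidates(candidates, context)
linked = disambiguate(shortlist, predicted_type="company")
```

The point of the sketch is the division of labor: expansion raises candidate recall, filtering shrinks the candidate set before the expensive step, and disambiguation compares a predicted category against each candidate's category instead of scoring entities directly.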
Keywords/Search Tags: entity linking, Chinese Wikipedia, deep learning, graph model, memory network