
Research And Application Of Zero-shot Entity Linking

Posted on: 2022-12-13
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Wang
GTID: 2518306779971999
Subject: Tourism
Abstract/Summary:
Entity linking is the task of linking a mention in natural language text to its target entity in a given knowledge base. As a key technology in natural language processing, it provides entity understanding for downstream tasks such as information extraction and knowledge-based question answering. Although traditional entity linking has achieved good results in some general domains, it cannot be directly extended to specialized domains, because it relies on large manually annotated datasets and on rich semantic knowledge about the entities in the knowledge base. Recent work has therefore started to investigate zero-shot entity linking in specialized domains, using only the descriptions of entities in the knowledge base and annotated datasets from general domains.

Entity linking mainly consists of two stages: candidate generation and candidate ranking. Although current zero-shot entity linking work has made some progress, two challenges remain: (1) In the candidate generation stage, current work pursues efficiency at the expense of fully modeling the interaction between the mention text and the entity description, which results in low recall. (2) In the candidate ranking stage, current work scores each candidate entity against the mention separately, without considering all candidate entities together, which lowers the overall accuracy.

To address these issues, we propose a zero-shot entity linking method based on the ColbertEG and MCRC-ER models. The main contributions are as follows: (1) To address the insufficient interaction between mention text and entity description in the candidate generation stage, we propose a candidate generation method based on ColBERT, which uses late interaction to let the mention text and the entity abstract interact fully while preserving efficiency. (2) To address the fact that candidate entities are not considered as a whole in the candidate ranking stage, we propose a candidate ranking method, MCRC-ER, which models candidate ranking as a multiple-choice problem and ranks the candidate entities with a model based on multiple-choice reading comprehension. While considering all candidate entities as a whole, it also enables deeper interaction between the mention and the candidate entities through the encoder. (3) We design and implement a tourist attraction entity linking system based on the proposed zero-shot entity linking technology. It helps to understand the attractions described in travel guides and supports downstream tourism applications such as attraction guiding and route planning. The system consists of two stages: offline processing and online processing. The offline part trains the candidate generation and candidate ranking models on the CCKS2019 entity linking dataset, builds a knowledge base of tourist attractions from CN-DBpedia, and stores the tourist attraction entity vectors in the Milvus vector database. The online part uses spaCy to perform named entity recognition of tourist attractions in travel guide text and then links the recognized attractions to entities in the knowledge base. The system demonstrates the effectiveness and practicality of the method proposed in this thesis.
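
The two scoring ideas in the abstract can be illustrated with a small sketch. It is not the thesis's ColbertEG or MCRC-ER implementation, whose details are not given here. It assumes the standard ColBERT-style late interaction (each mention token takes its maximum cosine similarity over the entity's token vectors, and the maxima are summed) and assumes that a multiple-choice ranker normalizes its scores jointly over all candidates with a softmax. The function names, the toy 2-dimensional token vectors, and the ranker scores are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def late_interaction_score(mention_tokens, entity_tokens):
    """ColBERT-style MaxSim: sum, over mention tokens, of the maximum
    cosine similarity to any entity token. The two sides are encoded
    independently, so entity token vectors can be precomputed offline."""
    mention = [l2_normalize(v) for v in mention_tokens]
    entity = [l2_normalize(v) for v in entity_tokens]
    return sum(
        max(sum(a * b for a, b in zip(q, d)) for d in entity)
        for q in mention
    )


def softmax(scores):
    """Normalize scores jointly, so each candidate's probability
    depends on every other candidate (the multiple-choice view)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


# Toy token vectors (hypothetical; real ones come from a trained encoder).
mention = [[1.0, 0.0], [0.0, 1.0]]
entity_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
entity_b = [[1.0, 0.0], [-1.0, 0.0]]

# Stage 1: candidate generation scores via late interaction.
score_a = late_interaction_score(mention, entity_a)
score_b = late_interaction_score(mention, entity_b)

# Stage 2: hypothetical ranker scores for two candidates, normalized jointly.
ranker_scores = [2.0, 1.0]
probs = softmax(ranker_scores)
```

In this toy case `entity_a` matches both mention tokens and so scores higher than `entity_b`, which matches only one. Because late interaction needs only token-level dot products at query time, the entity side can be indexed in advance, which is the efficiency argument the abstract makes for the candidate generation stage.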
Keywords/Search Tags: Entity linking, Zero-shot entity linking, Candidate generation, Candidate ranking