Research And Application Of Entity Linking Algorithm Based On Multi-source Information

Posted on: 2021-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Yang
Full Text: PDF
GTID: 2428330623467766
Subject: Computer Science and Technology
Abstract/Summary:
Entity linking is an essential technology in the field of knowledge graphs. It aims to map each entity mention in unstructured text to the corresponding entity stored in a knowledge base, which helps computers understand natural language more accurately. Entity linking is widely used in scenarios such as knowledge base expansion, information retrieval, intelligent question answering, and content recommendation, and it is one of the most active research topics in the field of knowledge graphs. This thesis surveys domestic and international research on entity linking based on graph structure and entity embedding. Focusing on three problems (candidate sets for linking are too noisy, candidate entity coherence is measured inaccurately, and similar candidates are difficult to disambiguate), it proposes two collective entity linking algorithms based on graph structure. Specifically, the main work of this thesis includes:

1. A collective entity linking algorithm based on LeaderRank, LRCEL, which consists of four modules: entity recognition, candidate generation, entity association graph construction, and candidate ranking. The algorithm uses the latent semantic information contained in both the input text and the local knowledge base to generate a small and accurate candidate set. It then constructs an entity association graph from the candidate set, which encodes the strength of the semantic relationships between candidate entities. Finally, it uses the multi-source information contained in the association graph together with LeaderRank to rank the candidates, and selects a group of candidates as the final link targets for the mentions in the input. Experimental results show that, compared with the classic collective entity linking method Babelfy, LRCEL has advantages in candidate generation, entity coherence measurement, and other aspects. Its overall linking performance is also better, with the average F1 score increasing by 11%.

2. A collective entity linking algorithm based on entity embedding, EECEL. EECEL uses random walks and a word embedding model to generate an entity embedding for each entity in the knowledge base, and then uses the entity embeddings directly to calculate the coherence between candidate entities, improving how entity coherence is represented in the entity association graph. The algorithm also uses entity embeddings to generate a topic vector for the input text, and uses the topic vector to optimize the candidate generation and candidate ranking modules. Experimental results on three data sets based on two knowledge bases show that EECEL performs better than LRCEL, with the average F1 score increasing by 2%, which shows that entity embedding can help entity linking algorithms achieve better results.

Subsequent research on LRCEL and EECEL will focus on three aspects: context selection methods, local knowledge base densification, and entity mention recognition methods, in order to improve on the current performance of both algorithms.
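The candidate ranking step that the abstract attributes to LeaderRank can be illustrated with a minimal sketch. This is not the thesis's implementation: it is the generic LeaderRank iteration (a ground node linked in both directions to every node, scores spread along out-edges, and the ground node's final score shared equally), applied here to a weighted graph as an assumption. The function name `leader_rank`, the edge-dictionary format, the unit weight on ground-node links, and the toy candidate names and coherence weights are all hypothetical.

```python
def leader_rank(edges, tol=1e-9, max_iter=1000):
    """Rank nodes of a weighted directed graph with LeaderRank.

    edges: dict mapping (source, target) -> positive weight.
    Only nodes that appear in at least one edge are ranked.
    """
    nodes = sorted({n for edge in edges for n in edge})
    ground = object()  # sentinel for the extra ground node

    # Build weighted out-neighbour lists.
    out = {n: {} for n in nodes}
    out[ground] = {}
    for (u, v), w in edges.items():
        out[u][v] = out[u].get(v, 0.0) + w
    # Assumption: ground links carry unit weight in both directions.
    for n in nodes:
        out[n][ground] = 1.0
        out[ground][n] = 1.0

    # Every real node starts with one unit of score, the ground node with none.
    score = {n: 1.0 for n in nodes}
    score[ground] = 0.0

    for _ in range(max_iter):
        new = dict.fromkeys(out, 0.0)
        for u, nbrs in out.items():
            total = sum(nbrs.values())
            for v, w in nbrs.items():
                # Each node passes its score on in proportion to edge weight.
                new[v] += score[u] * w / total
        delta = max(abs(new[n] - score[n]) for n in out)
        score = new
        if delta < tol:
            break

    # Share the ground node's score equally among the real nodes.
    share = score.pop(ground) / len(nodes)
    return {n: s + share for n, s in score.items()}


# Toy association graph: two candidates for the mention "Paris" and one
# candidate for the mention "France". The weights are made-up coherence values.
edges = {}
for u, v, w in [("Paris_(France)", "France", 0.9),
                ("Paris_Hilton", "France", 0.1)]:
    edges[(u, v)] = w
    edges[(v, u)] = w

scores = leader_rank(edges)
best_for_paris = max(["Paris_(France)", "Paris_Hilton"], key=scores.get)
```

In this toy graph the candidate that is more coherent with the other mention's candidate receives the higher score, which is the behaviour a collective linker relies on when it picks one candidate per mention.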
Keywords/Search Tags: entity linking, ambiguity resolution, random walk, entity embedding, knowledge graph