Research On Person Entity Linking For Different Scenarios

Posted on: 2024-06-20
Degree: Master
Type: Thesis
Country: China
Candidate: P Zhou
Full Text: PDF
GTID: 2568306938979749
Subject: Computer technology
Abstract/Summary:
Entity linking, an important task in natural language processing, aims to accurately link entity mentions to the corresponding entity nodes in a knowledge graph. On the one hand, among the many available data sources, the encyclopedic corpus is an ideal choice for entity linking because of its extensive knowledge coverage, high-quality semi-structured information, and rich contextual support. On the other hand, among the various types of entities, person entities carry the main factual components of the knowledge graph, but the large number of people who share the same name makes linking much more difficult. Therefore, this thesis constructs multiple person entity linking datasets for different scenarios based on the encyclopedic corpus, and proposes person entity linking models adapted to those scenarios. The main contents of this thesis are as follows:

(1) Research on person entity linking for personal experience descriptions
In practical applications, many web pages and documents describe personal experiences, and the persons they describe need to be disambiguated because shared names are unavoidable. In this scenario, the goal of the research is to accurately link the person described by a personal experience text to the corresponding entity node in the knowledge graph. However, existing conventional entity linking datasets are relatively limited in size, and because they mix multiple entity types, their entity distribution is uneven, which makes them difficult to use for training person entity linking models. In this thesis, two large-scale datasets are constructed from the encyclopedic corpus and used to test the performance of various baseline methods in this scenario, and a person entity linking model combining information interaction is proposed. The experimental results show that the proposed model can effectively establish links between the persons described by personal experiences and knowledge graph nodes, and that it performs better than the baseline methods on both datasets.

(2) Research on context-oriented person entity linking
In another scenario, person-related web pages and document texts contain a large number of person mentions that also need to be disambiguated. This research therefore aims to accurately link person mentions in web pages or document texts to the corresponding entity nodes in the knowledge graph. Most existing entity linking methods use encoded mention contexts and pre-trained candidate entity embeddings, which reduces the linking process to a problem of semantic matching and candidate entity ranking. As a result, the model learns the fine-grained interactions between the mention context and the candidate entities only superficially. To alleviate this problem, this thesis proposes a multi-turn multi-choice framework based on a generative model, which lets the mention context and the candidate information interact fully during encoding and considers the semantic consistency between multiple mentions. At the same time, a more extensive and general dataset is constructed from encyclopedic data. Experimental results show that the method outperforms the baseline methods and performs better on unlinkable mentions.

(3) Building a person entity linking system for encyclopedic texts
This thesis uses the encyclopedic web corpus to construct large-scale model training data and a person knowledge graph. Based on the research above, it also builds an online person entity linking system for encyclopedic texts. The system has a lightweight structure and lets users quickly query the linked information to meet their knowledge needs.

In summary, this thesis focuses on the person entity linking task, studies the problem in depth in different scenarios, and uses the encyclopedic corpus to construct several person entity linking datasets suited to those scenarios. First, for scenarios in which the persons described by personal experiences need to be disambiguated, this thesis proposes a linking model combining information interaction and validates its effectiveness through experiments. Then, for scenarios in which a web page or document text contains a large number of person mentions that need to be disambiguated, this thesis proposes a multi-turn multi-choice framework based on a generative model, which approaches the problem from a new perspective. Finally, this thesis combines the above research to build an online Chinese person entity linking system that supports practical applications.
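
For illustration only, the conventional approach that part (2) contrasts itself with (semantic matching followed by candidate entity ranking) can be sketched as below. This is not the thesis's model: the function names, the toy two-dimensional vectors, the cosine scoring, and the `nil_threshold` used to mark a mention as unlinkable are all assumptions made for this sketch. A real system would obtain the vectors from a trained encoder.

```python
import math
from typing import Mapping, Optional, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def link_mention(
    mention_vec: Sequence[float],
    candidates: Mapping[str, Sequence[float]],
    nil_threshold: float = 0.5,  # hypothetical cut-off, not taken from the thesis
) -> Optional[str]:
    """Rank candidates by similarity to the mention context vector.

    Returns the best-scoring candidate id, or None when no candidate reaches
    the threshold (the mention is treated as unlinkable).
    """
    best_id: Optional[str] = None
    best_score = -1.0
    for entity_id, entity_vec in candidates.items():
        score = cosine(mention_vec, entity_vec)
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id if best_score >= nil_threshold else None


# Two hypothetical persons who share a name, each with a toy entity embedding.
candidates = {"person_a": [1.0, 0.0], "person_b": [0.0, 1.0]}
linked = link_mention([0.9, 0.1], candidates)
unlinked = link_mention([0.7, 0.7], candidates, nil_threshold=0.9)
```

In this sketch the mention and each candidate are encoded independently and meet only in a single similarity score, which is the limited interaction that the multi-turn multi-choice framework is described as addressing.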
Keywords/Search Tags: Knowledge Graph, Encyclopedic Corpus, Entity Linking, Entity Linking System