
Research And Application Of Entity Disambiguation Method Based On Named Domain Knowledge Graph

Posted on: 2022-12-10
Degree: Master
Type: Thesis
Country: China
Candidate: R T Duan
Full Text: PDF
GTID: 2518306764980099
Subject: Journalism and Media
Abstract/Summary:
In recent years, with the growing number of personal computers and mobile terminals, large amounts of data are generated continuously. To better process and use this data, Google first proposed the concept of the knowledge graph and developed its own knowledge graph system. Building a knowledge graph requires large amounts of data, and data drawn from different sources is heterogeneous. The same named entity is often expressed differently in different domains, so the entities in the subgraphs of a domain knowledge graph are inconsistent. To reduce redundancy and conflict among entity nodes in a domain knowledge graph, this thesis improves the end-to-end coreference resolution model. The classic model builds its word vectors with a static word embedding model, so it handles polysemous words poorly in coreference resolution. This thesis introduces dynamic vectors to improve the model's performance. A new end-to-end coreference resolution model is proposed from two angles: introducing external context information and adding dynamic information internally. A visualization system is also designed to make it easier to evaluate and improve the model. The main contributions are as follows:

(1) The End-to-End Model with a Priori Vector (ETEPV) is proposed as an improvement on the classic end-to-end coreference resolution model. A context vector is constructed for each word using the BERT word embedding model, and a prior vector is constructed from the similarity between the context vectors and the feature vector of each candidate coreference pair. Dynamic context information is thus introduced from outside the model to improve its judgment of candidate coreference pairs. In experiments on OntoNotes 5.0, the model's average accuracy and recall on the coreference resolution task improved by 2.5% and 0.5%, respectively, compared with the baseline model.

(2) Building on the above model, dynamic word vectors are introduced to improve mention extraction. SpanBERT, a high-performance encoder suited to extracting span features, replaces the GloVe model so that dynamic information enters the construction of span feature vectors, improving the model's ability to extract mentions. A Bi-GRU model then extracts context information from the dynamic word vectors and constructs the span feature vectors. The other deep learning components are unchanged. Verified on the open dataset, the model improves average accuracy by 1.7% and average recall by 2.2% over the original model.

(3) A visualization system for the end-to-end coreference resolution model is designed and implemented so that users can intuitively judge the performance of their own models. Users can also annotate text data in the system and input it into the system to test the model, improving how well it completes different coreference resolution tasks and supporting iterative improvement of the network model.
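The prior vector in contribution (1) is described only as being based on the similarity between context vectors and a candidate pair's feature vector. The sketch below illustrates one possible reading of that idea. It is not the thesis's implementation: the choice of cosine similarity, the assumption that both vectors share one dimension, and the names `cosine` and `prior_vector` are all assumptions made for illustration, and the toy 3-dimensional vectors merely stand in for BERT embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def prior_vector(context_vectors, pair_vector):
    """Hypothetical prior: one similarity score per context word.

    Assumes the context vectors and the pair feature vector have the same
    dimension; the abstract does not state how this is achieved.
    """
    return [cosine(c, pair_vector) for c in context_vectors]


# Toy vectors standing in for per-word BERT context embeddings.
context_vectors = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [1.0, 1.0, 0.0],
]
# Stand-in for the feature vector of one candidate coreference pair.
pair_vector = [1.0, 0.0, 0.0]

prior = prior_vector(context_vectors, pair_vector)
print(prior)
```

In a model of this kind, such scores would presumably be fed into the pair-scoring component alongside the pair's own features. How ETEPV actually combines them is not stated in the abstract.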
Keywords/Search Tags: Knowledge graph, Coreference resolution, Word embedding, Neural network