
Research On Key Issues Of Document-Level Entity Relation Recognition

Posted on: 2021-05-19   Degree: Master   Type: Thesis
Country: China   Candidate: T Wu   Full Text: PDF
GTID: 2428330605474757   Subject: Computer Science and Technology
Abstract/Summary:
Entity relation recognition is a classic problem in natural language processing. It stems from the need to turn unstructured data into structured form, and it aims to identify the semantic relations between pairs of entities in a given natural language text. It provides an automatic way to obtain data for downstream tasks such as knowledge graph construction and automatic question answering. Current work on entity relation recognition is mostly concentrated at the sentence level, but in natural language it is very common for related entities to span multiple sentences, so entity relation recognition should be treated as a discourse-level (document-level) task; sentence-level research has obvious limitations. Because annotated corpora for discourse-level entity relations are scarce, studies on discourse-level entity relation recognition remain few.

Analysis of existing research reveals two limitations in discourse-level entity relation recognition. On the one hand, syntactic information and long-distance dependencies are not captured sufficiently. On the other hand, other discourse-level information (such as coreference) is not effectively integrated, and the global dependencies among the various kinds of discourse-level information are not analyzed adequately. To address these problems, this paper makes contributions in the following three aspects.

(1) To alleviate the shortage of corpus resources, which are currently concentrated in the biological domain, we use synonymy reasoning to construct two discourse-level entity relation recognition corpora covering the domains of national defense science and technology and news (the general idea is illustrated in the first sketch below). For the national defense science and technology domain, we propose a semi-automatic annotation strategy that combines intra-document and cross-document annotation, and we give a detailed analysis of the resulting corpus from multiple perspectives. For the news domain, we build on the ACE 2005 corpus, the most commonly used corpus for entity relation recognition, which already provides coreference annotation. In this way, we construct two discourse-level entity relation corpora.

(2) To address the insufficient capture of syntactic information and long-distance dependencies, we incorporate additional features such as syntax and coreference as graph information and adopt a graph convolutional model to encode the graph. Considering that different features contribute differently to the task, we further combine graph convolution with a multi-head attention mechanism to build a graph attention convolutional model, as sketched below. Experimental results on three corpora verify the effectiveness of the proposed approach.

(3) To address the lack of analysis of the global dependencies among discourse-level information, taking coreference as an example, we explore three ways of using coreference information to assist entity relation recognition: direct entity relation recognition at the discourse level, derivation of discourse-level relations from intra-sentence recognition through coreference, and multi-task learning that couples coreference resolution with entity relation recognition (see the final sketch below). The experimental results show that recognizing entity relations directly at the discourse level outperforms the intra-sentence reasoning approach, and the multi-task learning approach also achieves certain gains.
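To make the corpus-construction idea in (1) concrete, the following is a minimal sketch of synonymy-based projection: seed relation triples are matched against raw documents through a synonym dictionary, yielding candidate document-level instances for a human annotator to verify. All helper names and the toy data are hypothetical; this is not the thesis's actual annotation pipeline.

```python
# Sketch (hypothetical): project seed triples onto documents via synonym matching.
from typing import Dict, List, Set, Tuple

def expand(name: str, synonyms: Dict[str, Set[str]]) -> Set[str]:
    """All surface forms treated as synonymous with `name`, including itself."""
    return {name} | synonyms.get(name, set())

def project_triples(doc: str,
                    triples: List[Tuple[str, str, str]],
                    synonyms: Dict[str, Set[str]]) -> List[Tuple[str, str, str]]:
    """Return triples whose head and tail (or a synonym of each) both occur in `doc`.
    The two mentions may sit in different sentences, which is exactly the
    document-level case; each candidate is still confirmed by an annotator."""
    candidates = []
    for head, relation, tail in triples:
        head_hit = any(alias in doc for alias in expand(head, synonyms))
        tail_hit = any(alias in doc for alias in expand(tail, synonyms))
        if head_hit and tail_hit:
            candidates.append((head, relation, tail))
    return candidates

if __name__ == "__main__":
    # Toy example only.
    synonyms = {"UAV": {"unmanned aerial vehicle", "drone"}}
    triples = [("UAV", "developed_by", "Acme Labs")]
    doc = "Acme Labs unveiled a new drone. The unmanned aerial vehicle was tested later."
    print(project_triples(doc, triples, synonyms))
```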
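For (2), here is a minimal sketch of what a graph attention convolutional layer of the kind described above might look like, assuming PyTorch and a binary adjacency matrix built from dependency-syntax and coreference edges. The class and parameter names are illustrative assumptions, not the thesis's implementation.

```python
# Sketch (assumed design): graph convolution combined with multi-head attention,
# where attention is restricted to syntactic/coreference edges of a document graph.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphAttentionConv(nn.Module):
    def __init__(self, hidden_dim: int, num_heads: int = 4):
        super().__init__()
        assert hidden_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_dim // num_heads
        self.q = nn.Linear(hidden_dim, hidden_dim)
        self.k = nn.Linear(hidden_dim, hidden_dim)
        self.v = nn.Linear(hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h:   (batch, num_nodes, hidden_dim) node states (words/mentions)
        # adj: (batch, num_nodes, num_nodes), 1 where a syntax or coreference
        #      edge connects two nodes, else 0
        b, n, _ = h.size()
        q = self.q(h).view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k(h).view(b, n, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.v(h).view(b, n, self.num_heads, self.head_dim).transpose(1, 2)

        # Attention scores masked to graph edges, so each head learns its own
        # weighting of the neighbours instead of the uniform GCN average.
        scores = torch.matmul(q, k.transpose(-2, -1)) / self.head_dim ** 0.5
        mask = adj.unsqueeze(1).eq(0)            # broadcast over heads
        scores = scores.masked_fill(mask, float('-inf'))
        weights = torch.softmax(scores, dim=-1)
        weights = torch.nan_to_num(weights)      # nodes with no edges get zero weights

        mixed = torch.matmul(weights, v)         # neighbourhood aggregation
        mixed = mixed.transpose(1, 2).contiguous().view(b, n, -1)
        return F.relu(self.out(mixed) + h)       # residual connection
```

The edge-masked multi-head attention is one way to realize the stated motivation that different features (syntactic vs. coreference edges) should contribute differently to the representation.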
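For the multi-task variant in (3), the following is a minimal sketch assuming a shared document encoder with separate coreference and relation heads trained with a summed loss. The BiLSTM encoder, the pair representation, and all names are assumptions for illustration, not the thesis's model.

```python
# Sketch (assumed architecture): joint training of coreference resolution
# and entity relation recognition over a shared document encoder.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointCorefRelationModel(nn.Module):
    def __init__(self, vocab_size: int, hidden_dim: int, num_relations: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        self.encoder = nn.LSTM(hidden_dim, hidden_dim // 2,
                               batch_first=True, bidirectional=True)
        self.coref_head = nn.Linear(2 * hidden_dim, 1)             # do two mentions corefer?
        self.rel_head = nn.Linear(2 * hidden_dim, num_relations)   # relation between two entities

    def encode_pairs(self, tokens: torch.Tensor, pair_index: torch.Tensor) -> torch.Tensor:
        # tokens:     (batch, seq_len) token ids of the document
        # pair_index: (num_pairs, 2) token positions of the two mentions/entities
        states, _ = self.encoder(self.embed(tokens))        # (batch, seq_len, hidden_dim)
        left = states[:, pair_index[:, 0], :]
        right = states[:, pair_index[:, 1], :]
        return torch.cat([left, right], dim=-1)              # (batch, num_pairs, 2*hidden_dim)

    def forward(self, tokens, coref_pairs, rel_pairs, coref_labels, rel_labels):
        coref_logits = self.coref_head(self.encode_pairs(tokens, coref_pairs)).squeeze(-1)
        rel_logits = self.rel_head(self.encode_pairs(tokens, rel_pairs))

        coref_loss = F.binary_cross_entropy_with_logits(coref_logits, coref_labels)
        rel_loss = F.cross_entropy(rel_logits.view(-1, rel_logits.size(-1)),
                                   rel_labels.view(-1))
        # Joint objective; how to weight the two losses is a design choice.
        return coref_loss + rel_loss
```

Sharing the encoder lets the coreference signal shape the representations used for relation classification, which matches the abstract's finding that multi-task learning brings additional gains.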
Keywords/Search Tags:Entity Relation Recognition, Discourse-Level, Corpus Resources, Graph Attention Convolutional Network, Multi-Task Learning