
Research On Long Tail Problem And Cross-Sentence Relationship In Document-Level Relation Extraction

Posted on: 2024-06-16
Degree: Master
Type: Thesis
Country: China
Candidate: S Liu
Full Text: PDF
GTID: 2568307067493094
Subject: Computer Science and Technology
Abstract/Summary:
Screening entities of interest and their relationships from massive Internet text is a current research focus in information extraction within natural language processing, and an important step in building knowledge graphs. Previous work on entity relation extraction mostly focused on the sentence level, but in practical scenarios such as news articles and company announcements, relations are more often expressed at the document level. In document-level relation extraction, documents are long and the set of relation types is large, so the long-tail problem is very common. In addition, many entity relations span multiple sentences, and this cross-sentence problem degrades extraction performance.

To address the long-tail problem, the first work in this dissertation proposes an entity relation generation model that combines prompt templates and a decoder in a sequence-to-sequence architecture. By directly generating semantic labels and then classifying entity relations, the model avoids the mapping from semantic labels to category labels and can decode multiple relation types among entities with a single encoding pass. Experiments on three document-level datasets show that the method achieves the best performance on the few-shot metric Macro F1, and highly competitive performance on the overall Micro F1 metric.

Compared to single sentences, document-level text contains longer contexts, more entities, and more complex entity interactions. The second work in this dissertation proposes a cross-sentence entity relation extraction model based on entity interactions and coreference resolution. Specifically, it uses an attention-based entity interaction module, rather than a graph neural network, to model interactions between entities, avoiding the information loss caused by predefined edge-building rules. At the same time, it uses external NLP tools to annotate indicative pronouns in the document and an attention mechanism to incorporate this coreference information into the entity representations. Experiments show that the model is faster and more effective than the current best graph-based approaches.

Unlike the first two works, which predict relations between entities given labeled entities, and considering that gold entities are not always available in realistic scenarios, the third work in this dissertation proposes a serialization approach with shared relation names, implements an end-to-end document-level relation triple generation model, and alleviates the class imbalance problem in document-level triple extraction through a two-stage training strategy. Experiments on a document-level dataset validate the effectiveness of the approach, which achieves the best performance to date on two evaluation metrics.
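The first work's key idea, decoding the semantic label text directly instead of mapping generated text to class IDs through a verbalizer, can be illustrated with a minimal sketch. The relation names and the post-processing function below are illustrative assumptions, not the dissertation's actual code:

```python
# Hypothetical sketch: the seq2seq decoder emits a semantic label string
# (e.g. "place of birth") that IS the relation name, so no separate
# verbalizer mapping from generated tokens to category labels is needed.
RELATIONS = {"place of birth", "chief executive officer", "no relation"}

def decode_relation(generated_text):
    """Accept the generated span only if it is a known relation name;
    otherwise fall back to the null relation (an assumed convention)."""
    label = generated_text.strip().lower()
    return label if label in RELATIONS else "no relation"

print(decode_relation(" Place of Birth "))  # place of birth
print(decode_relation("unrelated output"))  # no relation
```

In this framing, classification reduces to string matching against the relation inventory, which is what removes the label-mapping step the abstract mentions.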
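The second work's attention-based entity interaction module, used in place of a graph neural network, can be sketched as plain scaled dot-product attention over entity representations. This is a minimal NumPy illustration of the general technique; the function names and dimensions are hypothetical, not the model's actual architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def entity_interaction(entity_reps):
    """Let every entity attend to every other entity via scaled
    dot-product attention, instead of message passing over a graph
    built from predefined edge rules (which can drop information)."""
    d_k = entity_reps.shape[-1]
    scores = entity_reps @ entity_reps.T / np.sqrt(d_k)  # (n, n) pairwise scores
    weights = softmax(scores, axis=-1)                   # each row sums to 1
    return weights @ entity_reps                         # interaction-aware reps

# toy example: 4 entity representations of dimension 8
rng = np.random.default_rng(0)
ents = rng.standard_normal((4, 8))
out = entity_interaction(ents)
print(out.shape)  # (4, 8)
```

Because every entity pair gets a learned (here, dot-product) weight, no pair is excluded up front, which is the information-loss argument the abstract makes against predefined edge-building rules.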
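The third work's two-stage training strategy for class imbalance can be sketched as a data-side procedure: train first on the full triple set, then revisit long-tail relation types with oversampling. The threshold, sampling scheme, and function below are illustrative assumptions about how such a strategy might look, not the dissertation's actual method:

```python
from collections import Counter
import random

def two_stage_batches(triples, tail_threshold=2, seed=0):
    """Stage 1: the full triple set (head and tail relations together).
    Stage 2: only triples whose relation type is rare (count <= threshold),
    oversampled toward the head-class size so rare relation types
    receive extra gradient signal in a second training pass."""
    rng = random.Random(seed)
    counts = Counter(rel for _, rel, _ in triples)
    stage1 = list(triples)
    tail = [t for t in triples if counts[t[1]] <= tail_threshold]
    target = max(counts.values())
    n_tail_types = len({t[1] for t in tail})
    stage2 = [rng.choice(tail) for _ in range(target * n_tail_types)] if tail else []
    return stage1, stage2

# toy corpus: one frequent relation, one rare relation
triples = [("A", "born_in", "X")] * 5 + [("B", "ceo_of", "Y")]
s1, s2 = two_stage_batches(triples)
print(len(s1), len(s2))  # 6 5
```

The design choice here is to keep stage 1 unbiased (the model sees the true distribution) and confine the rebalancing to stage 2, so head-relation performance is not sacrificed while tail relations are boosted.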
Keywords/Search Tags:Long-tail Problem, Entity Relation Extraction, Prompt Learning, Document-Level Text