Research and Implementation of a Prospectus-based Corporate Relation Recognition System

Posted on: 2024-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: W L Zhao
GTID: 2568306944956989
Subject: Computer technology
Abstract/Summary:
With the development of the Internet, there is growing demand for automatically extracting entities and relations from large volumes of prospectuses and for building a prospectus-based corporate relation recognition system. Entity relation extraction, the key task in building such a system, is attracting increasing research attention. It is divided into two subtasks: entity identification and relation extraction. Joint entity and relation extraction (JERE) models aim to complete both subtasks with a single end-to-end model, extracting entities and relations from text. Researchers have already addressed the problem of identifying overlapping entities and have been studying the dependency between the subtasks. Building on that work, this thesis further explores the dependency between the two subtasks. First, the thesis shows experimentally that entity identification and relation extraction are interdependent. It then designs and implements Double Information Branch for JERE (DIB), which uses a double-branch convergence structure to learn the dependencies between the two subtasks in forward propagation and back-propagation respectively. Experimental results show that DIB achieves superior results on the prospectus dataset and the open academic dataset.

When entity relation extraction is applied in industry, the long-tail problem cannot be ignored, and existing solutions often sacrifice performance on head classes. To address this, the thesis decomposes the long-tail problem into two sub-problems: data imbalance and few-shot learning of tail classes. It then proposes Meta-based Dual Expert for Relation Extraction (MDERE). To alleviate data imbalance, MDERE adopts a dual-expert structure that reduces the sacrifice of head relations. For few-shot learning of tail classes, Multi-label Oriented Meta Learning is proposed to optimize the learning process of the tail expert. Experimental results show that MDERE achieves state-of-the-art (SOTA) macro-F1 on the prospectus dataset.

Finally, the thesis designs and implements a prospectus-based corporate relation recognition system using the above methods. The system test report verifies the feasibility of the research methods in implementing the corporate relation recognition system.
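The abstract reports macro-F1 for the long-tail setting. The following minimal sketch illustrates why that metric is sensitive to tail classes. It is not taken from the thesis: the relation labels and toy predictions are hypothetical, and the task is simplified to single-label classification (the thesis mentions a multi-label setting).

```python
from collections import Counter


def per_class_f1(gold, pred):
    """Return {label: F1} for every label seen in gold or pred."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label receives a false positive
            fn[g] += 1  # gold label receives a false negative
    scores = {}
    for label in set(gold) | set(pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


def micro_f1(gold, pred):
    """In the single-label case, micro-F1 equals accuracy."""
    correct = sum(g == p for g, p in zip(gold, pred))
    return correct / len(gold)


# Hypothetical toy data: 8 head-class examples and 2 tail-class examples.
gold = ["subsidiary_of"] * 8 + ["guarantor_of"] * 2
# A model that ignores the tail class and always predicts the head class.
pred = ["subsidiary_of"] * 10

print(f"micro-F1: {micro_f1(gold, pred):.3f}")
print(f"macro-F1: {macro_f1(gold, pred):.3f}")
```

In this toy case the model that ignores the tail class still scores well on micro-F1, while macro-F1 drops sharply because the tail class contributes an F1 of zero with the same weight as the head class.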
Keywords/Search Tags:entity relation extraction, multi-task learning, task dependency, long-tail problem, meta learning