
Research On Entity Relation Extraction Technology Based On Deep Learning

Posted on: 2022-12-04    Degree: Master    Type: Thesis
Country: China    Candidate: C R Deng    Full Text: PDF
GTID: 2518306779996589    Subject: Automation Technology
Abstract/Summary:
Massive amounts of text data are generated every day in the current era of rapid Internet development. In the field of natural language processing, how to structure these unstructured texts and extract valuable information from them has recently become a hot research topic. After years of research, a widely recognized and effective approach is to build knowledge graphs, and their key enabling technique is entity relation extraction, which aims to identify entities in text and determine the semantic relationships between them from the surrounding context. Earlier statistics-based approaches were time-consuming and had poor transferability. Deep learning techniques have gradually been applied to this task in recent years and have produced many results, but certain limitations remain. Most current methods, for example, rely on static word vectors, which have limited semantic representation capacity and cannot express the multiple senses of a word, while pipeline-based methods for extracting entity relations suffer from error propagation and insufficient interaction between the two sub-tasks.

The main work of this thesis is as follows:
(1) The research background and significance of entity relation extraction are reviewed, together with the current state of domestic and international research, the limitations and drawbacks of previous methods, and the theories and techniques relevant to this task.
(2) To address the limited semantic representation of static word vectors and their inability to express polysemous words, the effect of text representation on model performance is examined along with pre-trained language models, and BERT is used to construct word vectors that implicitly carry rich contextual information. Experiments on the WebNLG dataset show that the F1 score improves by 2.3% compared with the ETL-Span model.
(3) To resolve the error propagation and the insufficient interaction between the two sub-tasks in pipeline-based entity relation extraction, a new annotation scheme is proposed, on the basis of which joint extraction is performed through parameter sharing, strengthening the interaction between the two sub-tasks while preserving extraction quality. Experiments on the DuIE dataset show that the F1 score improves by 1.3% over the FETI model.

The innovations of this thesis include:
1. Using the pre-trained language model BERT for text representation, a pointer-network-based entity relation extraction model is constructed (see the first sketch after this list). The pointer network predicts the entity and relation types in the sentence; the word vectors together with the predicted entity and relation types are then fed into a BiGRU to extract the semantic features implicit in the sentence, and finally the global matrix output by BERT is used to guide the model through entity relation extraction. Experiments were conducted on the publicly available English dataset WebNLG. The model achieves an F1 score of 85.4%, a precision of 87.2% and a recall of 83.7%, improving the combined F1 metric by 2.3%, 23.8% and 29.0% over the ETL-Span, OrderRL and CopyMTL models respectively.
2. A joint model is constructed based on RoBERTa (see the second sketch below). To incorporate entity subject-object features and entity type information into the model, a new annotation scheme is designed, on the basis of which joint extraction is performed through parameter sharing. The RoBERTa pre-trained language model and a BiLSTM neural network further improve the model's performance. Experiments were completed on the publicly available Chinese dataset DuIE, where the F1 score reaches 77.1%. Compared with the FETI, MHS and WDec models, the combined F1 metric improves by 1.3%, 8.1% and 18.4% respectively, and these empirical results validate the advantages of the proposed model.
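To make the first innovation more concrete, below is a minimal sketch, assuming standard PyTorch and HuggingFace components rather than the thesis's actual code, of its two core ideas: contextual word vectors from a pre-trained BERT encoder in place of static vectors, and pointer-style start/end prediction heads over those vectors with a BiGRU in between. The checkpoint name, layer sizes and number of relation types are illustrative assumptions, and the thesis's use of the BERT global matrix to guide decoding is not shown here.

```python
# Minimal sketch (illustrative, not the thesis's implementation):
# contextual embeddings from BERT + BiGRU + pointer-style start/end heads.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizerFast

class PointerExtractor(nn.Module):
    def __init__(self, encoder_name="bert-base-cased", num_relations=5):
        super().__init__()
        self.encoder = BertModel.from_pretrained(encoder_name)
        hidden = self.encoder.config.hidden_size
        # BiGRU over the contextual embeddings to capture sequence features
        self.bigru = nn.GRU(hidden, hidden // 2, batch_first=True, bidirectional=True)
        # Pointer-style heads: per-token start/end scores for each relation type
        self.start_head = nn.Linear(hidden, num_relations)
        self.end_head = nn.Linear(hidden, num_relations)

    def forward(self, input_ids, attention_mask):
        # Contextual (non-static) word vectors from the pre-trained encoder
        h = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        h, _ = self.bigru(h)
        # Scores of shape (batch, seq_len, num_relations) for span starts and ends
        return self.start_head(h), self.end_head(h)

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
batch = tokenizer(["Paris is the capital of France."], return_tensors="pt")
start_logits, end_logits = PointerExtractor()(batch["input_ids"], batch["attention_mask"])
```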
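For the second innovation, the following is a minimal sketch, again under stated assumptions rather than the thesis's implementation, of joint extraction via parameter sharing: a single RoBERTa-style encoder followed by a BiLSTM feeds two task heads, one tagging entities under some annotation scheme and one scoring relations, and the two losses are summed so both sub-tasks update the shared parameters. The checkpoint name (a Chinese RoBERTa would be used for DuIE), the tag set and the relation set are placeholders.

```python
# Minimal sketch (illustrative): shared encoder + BiLSTM with two heads,
# trained jointly by summing the sub-task losses (parameter sharing).
import torch
import torch.nn as nn
from transformers import AutoModel

class JointExtractor(nn.Module):
    def __init__(self, encoder_name="roberta-base", num_tags=9, num_relations=10):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)  # shared parameters
        hidden = self.encoder.config.hidden_size
        self.bilstm = nn.LSTM(hidden, hidden // 2, batch_first=True, bidirectional=True)
        self.tag_head = nn.Linear(hidden, num_tags)        # per-token entity tags
        self.rel_head = nn.Linear(hidden, num_relations)   # sentence-level relation scores

    def forward(self, input_ids, attention_mask):
        h = self.encoder(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        h, _ = self.bilstm(h)
        tag_logits = self.tag_head(h)              # (batch, seq_len, num_tags)
        rel_logits = self.rel_head(h.mean(dim=1))  # (batch, num_relations)
        return tag_logits, rel_logits

def joint_loss(tag_logits, rel_logits, tag_labels, rel_labels):
    # Summing both losses lets gradients from each sub-task reach the shared encoder.
    ce = nn.CrossEntropyLoss()
    return ce(tag_logits.transpose(1, 2), tag_labels) + ce(rel_logits, rel_labels)
```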
Keywords: Entity relation extraction, Natural language processing, Attention mechanisms, Pre-trained language models, Neural networks