
Research On Entity Relation Extraction For News Texts

Posted on: 2021-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: X H Zhou
GTID: 2428330629951050
Subject: Communication and Information Engineering
Abstract/Summary:
News text carries valuable information, so extracting that information has great practical value. However, the volume of news is large and growing rapidly, and manual extraction costs time and effort. Entity recognition and relation extraction can automatically identify the entities in news, extract the relations between them, and lay the groundwork for subsequent in-depth analysis.

News text has three relevant characteristics. First, news is objective and serious in tone, so words with emotional tendencies are rare. Second, specific words often appear in news to denote specific organizations. Finally, a single news sentence may contain many entities of different types, and these entities may be linked by different types of relations. To handle these characteristics, this paper builds deep learning models that identify news entities and extract the relations between them from large volumes of news text, so that readers can quickly grasp the focus of the news and acquire information more efficiently.

For entity recognition, this paper proposes the ER-Mul ATT model, which treats entity recognition as a sentence-level sequence labeling task. The model first obtains word vectors from the corpus and adds character-level vectors that capture character features of words, such as case and abbreviations. It then uses a BiLSTM to extract context-dependent features of words and a self-attention mechanism to capture global correlations between words, which addresses the loss of long-distance information. Finally, a CRF produces the tag sequence.

For entity relation extraction, this paper proposes the RE-BiGCN model, which treats relation extraction as a sentence-level classification problem. The model input combines the word vector, the part-of-speech feature vector, the entity identification vector, and the character vector of each word obtained through Char BiLSTM. A BiLSTM extracts local correlations between words, and a BiGCN improved with dependency syntax analysis extracts long-distance features. The model then constructs a fully connected graph with relation-weighted edges, selects the optimal weighted edge as the optimal path, and from it selects the most appropriate entity relation.

Experiments show that the two models achieve F1 values on the news corpus that are 2.97% and 2.48% higher, respectively, than on the non-news corpus. Compared with current mainstream models, they improve the F1 value by 5.64% on entity recognition and by 7.35% on entity relation extraction.

The significance of this research is twofold. On the one hand, it uses deep learning to improve existing models, reduce manual effort, improve model performance, and enable subsequent analysis. On the other hand, although many models exist for entity recognition and relation extraction, none is tailored to the news domain. This paper focuses on building models for news text and demonstrates the effectiveness of such custom models through comparative experiments.
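The abstract states that the entity recognition model ends with a CRF that produces the tag sequence. As a rough illustration of what that last step does, the sketch below runs Viterbi decoding over per-token tag scores. It is a minimal sketch and not the thesis implementation: the function name `viterbi_decode`, the BIO tag set, and all scores are hypothetical, and the emission scores stand in for whatever the BiLSTM and self-attention layers would output.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][tag] is the score of `tag` at token t (hypothetical values
    here); transitions[(prev, cur)] is the score of moving from prev to cur,
    defaulting to 0.0 when a pair is not listed.
    """
    if not emissions:
        return []
    # best[t][tag]: best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            prev = max(
                tags,
                key=lambda p: best[t - 1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[t - 1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for the three tokens "the", "World", "Bank".
TAGS = ["O", "B-ORG", "I-ORG"]
EMISSIONS = [
    {"O": 2.0, "B-ORG": 0.1, "I-ORG": 0.0},
    {"O": 0.5, "B-ORG": 1.0, "I-ORG": 1.2},
    {"O": 0.3, "B-ORG": 0.2, "I-ORG": 1.5},
]
TRANSITIONS = {
    ("O", "I-ORG"): -10.0,  # an entity cannot start with an inside tag
    ("B-ORG", "I-ORG"): 1.0,
    ("I-ORG", "I-ORG"): 0.5,
}

print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))
```

With these made-up scores, picking the best tag for each token independently would label "World" as I-ORG directly after O, which is not a valid BIO sequence. Decoding over the whole sentence with transition scores avoids that, which is the usual motivation for placing a CRF on top of the per-token features.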
Keywords/Search Tags:News text, Natural language processing, Entity recognition, Entity relation extraction