
Research On Entity Relation Extraction Technology For Open Domain Text

Posted on: 2020-11-13
Degree: Master
Type: Thesis
Country: China
Candidate: Z K Zhou
Full Text: PDF
GTID: 2428330590961148
Subject: Software engineering
Abstract/Summary:
Entity relation extraction for open domain text is a natural language processing task with important research value. Its goal is to extract valuable entity relation information from massive texts efficiently and accurately. Distantly supervised entity relation extraction methods leverage the distant supervision hypothesis to automatically label large numbers of sentences and build models on these data, which effectively avoids the small data scale and strong domain dependence of supervised models. Distantly supervised models are therefore better suited to open domain text. However, a labeled sentence containing two entities does not necessarily express the relation between them; the resulting noisy sentences pose a challenge to distantly supervised extraction models.

The piecewise convolutional neural network with sentence-level attention (PCNN+ATT) model is currently the most popular distantly supervised relation extraction model. Although it assigns an attention weight to each sentence to suppress the interference of noisy sentences, it still has two limitations. First, PCNN+ATT adopts a PCNN module as its sentence encoder, whose features contain only local contextual information, leading to a loss of semantic information. Second, PCNN+ATT neglects word-level attention weights, so the sentence embeddings are not accurate enough to express the semantics of the sentences. To address these two issues, we propose a hierarchical attention-based bidirectional GRU (HA-BiGRU) neural network model. For the first limitation, HA-BiGRU uses a BiGRU module in place of the PCNN, reducing the semantic information lost in the resulting sentence embeddings. For the second limitation, HA-BiGRU adopts a hierarchical attention mechanism that combines word-level and sentence-level attention.

To further alleviate the noise problem and improve the effectiveness of the HA-BiGRU model, we propose two denoising strategies that leverage the co-occurrence probability (CP) between the shortest dependency path (SDP) of an entity pair in a sentence and the sentence's relation label. From the data perspective, we set a CP threshold, treat labeled sentences whose relation labels have a CP below the threshold as noise, and filter them out, thereby improving the quality of the labeled data. From the model perspective, we concatenate the CP vectors of the relation labels corresponding to the SDP of the entity pair into the sentence embeddings encoded by the bidirectional GRU in HA-BiGRU, improving the accuracy of the correlation computed between sentences and relation labels; this assigns more reasonable weights to sentences and suppresses the influence of noise.

To verify the effectiveness of the proposed HA-BiGRU model and the two denoising strategies, we conduct comparison experiments on the Freebase+NYT distantly supervised labeled dataset. Experimental results show that HA-BiGRU outperforms the PCNN+ATT model, and that applying the two denoising strategies effectively reduces noise interference and further improves HA-BiGRU's effectiveness. In addition, we validate the effectiveness of the hierarchical attention mechanism through case studies.
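The hierarchical attention mechanism described above can be sketched in a few lines of NumPy. This is an illustrative toy, not the thesis's implementation: the BiGRU hidden states are stood in for by random vectors, and the word-level query `w` and relation query `r` are stand-ins for parameters that would be learned during training. Word-level attention pools each sentence's hidden states into a sentence embedding; sentence-level attention then pools the bag of sentence embeddings, down-weighting sentences that match the relation query poorly.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def word_attention(H, w):
    """Word-level attention: weight each word's hidden state by its
    score against query w, then pool to one sentence embedding.
    H: (T, d) hidden states, w: (d,) word-level query."""
    alpha = softmax(H @ w)          # (T,) attention weights over words
    return alpha @ H                # (d,) weighted-sum sentence embedding

def sentence_attention(S, r):
    """Sentence-level attention: weight each sentence in the bag by its
    similarity to the relation query r, suppressing noisy sentences.
    S: (n, d) sentence embeddings, r: (d,) relation query."""
    beta = softmax(S @ r)           # (n,) attention weights over sentences
    return beta @ S                 # (d,) bag representation

rng = np.random.default_rng(0)
d = 8
# Toy bag: 3 sentences, each with 5 stand-in "BiGRU" states of size d.
bag = [rng.normal(size=(5, d)) for _ in range(3)]
w = rng.normal(size=d)              # stand-in for a learned word query
r = rng.normal(size=d)              # stand-in for a learned relation query
S = np.stack([word_attention(H, w) for H in bag])  # (3, d)
b = sentence_attention(S, r)        # bag representation fed to a classifier
print(b.shape)
```

In the full model, `b` would be passed to a softmax classifier over relation labels, and `w`, `r`, and the BiGRU parameters would be trained end to end.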
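The data-level denoising strategy can likewise be illustrated with a small sketch. Assuming CP is estimated as the empirical conditional frequency P(relation | SDP) over the labeled corpus (the exact estimator in the thesis may differ), sentences whose (SDP, label) pair falls below the threshold are discarded; the SDP patterns and Freebase-style labels below are invented toy data.

```python
from collections import Counter, defaultdict

def cp_filter(samples, threshold=0.3):
    """Drop labeled sentences whose relation label co-occurs with the
    sentence's SDP pattern with probability below the threshold.
    samples: list of (sdp_pattern, relation_label) pairs."""
    counts = defaultdict(Counter)
    for sdp, rel in samples:
        counts[sdp][rel] += 1

    def cp(sdp, rel):
        # Empirical co-occurrence probability P(rel | sdp).
        return counts[sdp][rel] / sum(counts[sdp].values())

    return [(sdp, rel) for sdp, rel in samples if cp(sdp, rel) >= threshold]

# Toy labeled data: (SDP pattern, distant-supervision label).
samples = [
    ("born->in", "/people/person/place_of_birth"),
    ("born->in", "/people/person/place_of_birth"),
    ("born->in", "/people/person/place_of_birth"),
    ("born->in", "/location/location/contains"),  # likely mislabeled: CP = 0.25
]
clean = cp_filter(samples, threshold=0.3)
print(len(clean))  # 3
```

The model-level strategy reuses the same CP estimates: instead of discarding low-CP sentences, the CP values for a sentence's SDP are concatenated onto its BiGRU sentence embedding, giving the sentence-level attention extra evidence for assigning weights.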
Keywords/Search Tags: open domain text, distant supervision, entity relation extraction, HA-BiGRU, co-occurrence probability