
Joint Entity And Relation Extraction Based On Neural Network

Posted on: 2022-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: C H Cheng
GTID: 2518306740482924
Subject: Computer technology
Abstract/Summary:
With the rapid development of online news media and the we-media industry, massive and complex news content is being produced on the Internet. Efficiently associating and organizing entities and relations plays a key role in news governance, so extracting structured information from news content through entity relation extraction is of great value. Existing relation extraction methods rely on entity annotation, but entity annotation is time-consuming, labor-intensive and costly. It is therefore particularly important to study entity and relation extraction when only a few entities are annotated, or when no entities are annotated at all. However, current entity relation extraction still faces two problems. First, when only a few entities are annotated, existing methods do not extract text features sufficiently, and relation classification focuses only on the similarity of text features, which leads to poor relation extraction performance. Second, when no entities are annotated, existing methods recognize entity boundaries inaccurately and use little entity information in relation classification, which leads to a high error rate under entity nesting and relation overlapping. To solve these problems, this thesis proposes a model named Few-shot Relation Extraction based on Text Structure and Entity Pair (FsRe-TSEP) and another model named Joint Entity and Relation Extraction based on Word Semantic and Handshaking Tagger (JERE-WSHT). The thesis then uses these models to design and implement a prototype system for joint entity and relation extraction. The main research work of this thesis is as follows:

(1) To address the problem that entity relation extraction does not extract text features sufficiently and measures text features too simply, this thesis proposes the FsRe-TSEP model. First, the model vectorizes sentences at both the character level and the word level. Second, it extracts text structure features using the biaffine mechanism and a graph convolutional neural network. Meanwhile, the contextual features of the sentence are incorporated into the word vectors of the entity pair. The text structure features and entity pair features are then used together as sentence features. Finally, the model uses a similarity-difference relation network to extract both the similarities and the differences between sentence features to predict the relation of an entity pair, which avoids relying on an overly simple measure. Comparison experiments on public datasets show that FsRe-TSEP outperforms existing models, and ablation experiments indicate the effectiveness of its components.

(2) To address the problem that entity relation extraction has difficulty identifying entity boundaries accurately and handling entity nesting and relation overlapping, this thesis proposes the JERE-WSHT model. The model uses the pre-trained language model ERNIE and CWV word vectors to vectorize sentences, and encodes the position information of words using the relative distance between the head and tail characters of each word. Meanwhile, the model uses a handshaking tagging scheme to tag the entities and relations in a sentence, and replaces the tags of the head and tail characters of each entity with vector values to introduce entity type features. Finally, all relational triples in the sentence are output by decoding the handshaking tagging matrix. Comparison experiments on public datasets show that JERE-WSHT outperforms classical models, and ablation experiments indicate that its design is reasonable.

(3) This thesis designs and implements a prototype system for joint entity and relation extraction in the news domain. The prototype system uses the JERE-WSHT model to extract relational triples from news texts. Moreover, the FsRe-TSEP model is modified to accurately check the extracted entity relations on a small number of texts. In addition, the prototype system uses Uniform Content Label (UCL) and the handshaking tagging matrix to efficiently manage news pages, text sentences, entities and relations. Finally, the relational triples output by the prototype system are presented as a diagram through front-end visualization techniques. The prototype system verifies the feasibility of the proposed FsRe-TSEP and JERE-WSHT models in the field of news governance.
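To make the decoding step in (2) more concrete, the following is a minimal Python sketch of how relational triples might be read off a handshaking-style tagging. It is an illustration under simplifying assumptions, not the thesis's implementation: the function name `decode_triples`, the three tag sets (`entity_spans`, `head_links`, `tail_links`), and the example sentence are all hypothetical, and the tags are given as plain sets of index pairs rather than as a matrix predicted by a model.

```python
from collections import defaultdict


def decode_triples(tokens, entity_spans, head_links, tail_links):
    """Decode (subject, relation, object) triples from handshaking-style tags.

    All argument names are hypothetical and chosen for this sketch:
      entity_spans: set of (start, end) token indices, end inclusive
      head_links:   set of (relation, subject_start, object_start)
      tail_links:   set of (relation, subject_end, object_end)
    """
    # Group entity end positions by start position, so that nested
    # entities sharing the same first token are all kept as candidates.
    ends_by_start = defaultdict(list)
    for start, end in sorted(entity_spans):
        ends_by_start[start].append(end)

    triples = []
    for relation, subj_start, obj_start in sorted(head_links):
        for subj_end in ends_by_start.get(subj_start, []):
            for obj_end in ends_by_start.get(obj_start, []):
                # Accept a pair of spans only if their tails are also
                # linked under the same relation.
                if (relation, subj_end, obj_end) in tail_links:
                    subj = " ".join(tokens[subj_start:subj_end + 1])
                    obj = " ".join(tokens[obj_start:obj_end + 1])
                    triples.append((subj, relation, obj))
    return triples


# Hypothetical example with a nested entity: "New York" inside "New York City".
tokens = ["New", "York", "City", "is", "in", "New", "York", "State"]
entity_spans = {(0, 1), (0, 2), (5, 7)}
head_links = {("located_in", 0, 5)}
tail_links = {("located_in", 2, 7)}

triples = decode_triples(tokens, entity_spans, head_links, tail_links)
print(triples)
```

In this sketch, the nested span "New York" is rejected as the subject because its tail (index 1) has no tail link to the object's tail, which is how pairing head links with tail links can disambiguate nested entities. The entity type features and the vector-valued tags described in the abstract are omitted here.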
Keywords/Search Tags: joint entity and relation extraction, few-shot learning, handshaking tagging scheme, uniform content label