
Research On Named Entity Recognition Based On Graph Attention Network

Posted on: 2022-02-21  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Luo  Full Text: PDF
GTID: 2518306731977969  Subject: Computer technology
Abstract/Summary:
Named entity recognition (NER) is a sequence labeling task in natural language processing: given a text, identify the words that carry specific meaning, typically place names, organization names, and person names. NER is also a foundational step for information extraction, machine translation, public opinion monitoring, and other downstream tasks. Traditional approaches are dictionary-based or rule-based, but both are time-consuming, depend on hand-crafted linguistic features, and scale poorly. With the development of deep learning, neural methods have become the dominant research direction in this field. Research on English NER is relatively mature, but on Chinese corpora NER still faces several problems: first, recognition errors caused by automatic word-segmentation errors; second, the rich semantics of Chinese easily give rise to word ambiguity and polysemy, which interfere with recognition; third, some informal text consists of short, non-standard sentences that are difficult to recognize. To reduce the impact of these problems and to support subsequent natural language processing tasks, this thesis proposes LGATC, a model based on a graph attention network, and BLGATC, a graph attention network model that incorporates BERT. The research contents are summarized as follows.

(1) This thesis proposes LGATC, a Chinese-oriented NER model based on a graph attention network. First, fused character and word vectors are used as the input, so that word-level information supplements character-level information and the errors caused by word segmentation are reduced to some extent. Second, a bidirectional long short-term memory (BiLSTM) network captures global features. Third, a graph attention network layer operates on the syntactic dependency graph of the sentence, updating attention over each node's neighbors so that the model focuses on the most relevant information. Finally, a conditional random field (CRF) selects the optimal label sequence as output. On the public Chinese dataset MSRA, the F value of LGATC, with its improved input and graph attention network, is 3.12% higher than that of the baseline BiLSTM-CRF model.

(2) Building on LGATC, and in order to overcome the limitations of Word2vec, capture more semantic information in the input layer, and reduce the errors caused by polysemy, the static embedding layer is replaced with a dynamic BERT layer, yielding the BLGATC model. Because BERT conditions on context during training, it does not map a word to a single fixed vector and can therefore disambiguate better. On the MSRA dataset, the F value of BLGATC is 3.54% higher than that of LGATC. In addition, comparing BLGATC with five other Chinese NER models proposed in recent years on three public Chinese datasets (OntoNotes, MSRA, and Weibo NER) shows that the BERT-enhanced BLGATC achieves competitive results.
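The graph attention step described above can be sketched in isolation. The following is a minimal single-head graph attention layer in numpy, not the thesis's actual implementation: the node features stand in for BiLSTM outputs, and the edge list stands in for arcs from a dependency parser; all dimensions, weights, and arcs here are made-up illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sentence of 5 tokens. The edges stand in for syntactic dependency
# arcs produced by an external parser (hypothetical values). Each node
# attends to its neighbours and to itself, as in a standard GAT layer.
num_nodes, in_dim, out_dim = 5, 8, 4
h = rng.normal(size=(num_nodes, in_dim))          # node features (e.g. BiLSTM outputs)
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (1, 4)]  # undirected dependency arcs

# Adjacency with self-loops: a node always attends to itself.
adj = np.eye(num_nodes, dtype=bool)
for i, j in edges:
    adj[i, j] = adj[j, i] = True

W = rng.normal(size=(in_dim, out_dim))  # shared linear transform
a = rng.normal(size=(2 * out_dim,))     # attention vector

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

z = h @ W                      # transformed node features, (N, out_dim)
src = z @ a[:out_dim]          # source half of a^T [z_i || z_j]
dst = z @ a[out_dim:]          # destination half
e = leaky_relu(src[:, None] + dst[None, :])   # raw scores e_ij, (N, N)
e = np.where(adj, e, -np.inf)                 # mask non-neighbours

# Softmax over each node's neighbourhood, then aggregate.
alpha = np.exp(e - e.max(axis=1, keepdims=True))
alpha /= alpha.sum(axis=1, keepdims=True)
out = alpha @ z                # updated node representations, (N, out_dim)
```

Masking with `-np.inf` before the softmax is what restricts attention to the dependency neighborhood: `exp(-inf)` is exactly zero, so non-neighbors contribute nothing to the weighted sum.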
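The final CRF step selects the optimal label sequence, which at inference time amounts to Viterbi decoding over emission and tag-transition scores. A minimal sketch follows; the tag set and all scores are invented for illustration and are not taken from the thesis.

```python
import numpy as np

# Viterbi decoding as used by a CRF output layer: given per-token emission
# scores and tag-transition scores, recover the highest-scoring tag path.
tags = ["O", "B-PER", "I-PER"]
emissions = np.array([   # (seq_len=4, num_tags=3), illustrative values
    [2.0, 0.5, 0.1],
    [0.2, 2.5, 0.3],
    [0.1, 0.4, 2.2],
    [1.8, 0.3, 0.2],
])
transitions = np.array([  # transitions[i, j] = score of moving tag i -> tag j
    [0.5, 0.8, -2.0],
    [-1.0, -2.0, 1.5],
    [0.2, -2.0, 0.5],
])

n, t = emissions.shape
score = emissions[0].copy()          # best score ending in each tag so far
back = np.zeros((n, t), dtype=int)   # backpointers
for step in range(1, n):
    total = score[:, None] + transitions + emissions[step][None, :]
    back[step] = total.argmax(axis=0)
    score = total.max(axis=0)

# Backtrack from the best final tag.
best = [int(score.argmax())]
for step in range(n - 1, 0, -1):
    best.append(int(back[step][best[-1]]))
best.reverse()
decoded = [tags[i] for i in best]    # → ['O', 'B-PER', 'I-PER', 'O']
```

The transition scores are what lets the CRF rule out invalid label sequences (e.g. an `I-PER` that does not follow `B-PER` or `I-PER`), which per-token classification alone cannot do.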
Keywords/Search Tags:Named entity recognition, Graph attention network, Dependency parse, BERT