
Chinese Entity Recognition And Relation Extraction Method Based On Deep Learning

Posted on: 2022-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Zhu
Full Text: PDF
GTID: 2518306476990839
Subject: Signal and Information Processing
Abstract/Summary:
In the current information age, faced with massive volumes of images, text, audio, video, and other forms of information, how to quickly and accurately obtain the information needed for a given task is a mainstream research direction in the information field. For text data, that is, in natural language processing, information extraction has become a key research task because it processes the most basic elements of text. Information extraction has three subtasks: named entity recognition, relation extraction, and event extraction. Among them, entity recognition and relation extraction are the initial stages of many complex natural language processing tasks, and their results strongly affect downstream tasks, so research on these two tasks is of great significance. In this thesis, the two commonly used approaches, the pipeline method and joint extraction, are used to construct extraction models that realize entity and relation extraction and achieve good results.

Firstly, this thesis designs a BERT-Lattice-CRF model based on the pipeline approach to realize Chinese named entity recognition. Word vectors with rich semantic information are obtained from the BERT pre-trained language model, sequence encoding is then performed by a Lattice LSTM that merges word features of the text, and finally a CRF layer decodes the sequence to obtain the predicted entity labels. An accuracy of 94.72% and an F1-score of 94.72% are obtained on the MSRA corpus.

Secondly, this thesis designs a BERT-BiLSTM model to realize relation extraction. The named entity recognition results and the BERT output vectors are merged in the input layer, a BiLSTM encodes the sequence with context information, and finally a softmax function predicts the entity relationships. An accuracy of 74.78% is obtained on an open-source character relationship corpus.

Finally, this thesis designs a BERT-BiLSTM-LSTM joint entity and relation extraction model, following the idea of joint learning, to exploit the connection between the two tasks and obtain an overall optimal trained model. The model is divided into three modules. The input module fuses the word vector, text vector, and position vector in BERT to obtain vectors containing full-text semantic information. These pass to the entity recognition module, where BiLSTM encoding and softmax classification produce the entity recognition result. The entity label information and the BERT encoding information are then combined as the input of the relation extraction module, where an LSTM layer, a fully connected layer, and a softmax predict the relationship category, yielding the final entity and relation extraction result. An accuracy of 76.48% is obtained on the open-source character relationship corpus.
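To make the CRF decoding step of the pipeline model concrete, the following is a minimal pure-Python sketch of Viterbi decoding over per-token emission scores and label-transition scores, which is the kind of sequence decoding a CRF layer performs after the Lattice LSTM encoder. The toy scores and two-label setup are invented for illustration and are not from the thesis.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label sequence.

    emissions:   per-token list of scores, one score per label (from the encoder)
    transitions: transitions[i][j] = score of moving from label i to label j
    """
    n_labels = len(emissions[0])
    # score[j] = best score of any path ending in label j at the current token
    score = list(emissions[0])
    back = []  # backpointers for recovering the best path
    for emis in emissions[1:]:
        new_score, ptr = [], []
        for j in range(n_labels):
            best_i = max(range(n_labels), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emis[j])
            ptr.append(best_i)
        score = new_score
        back.append(ptr)
    # backtrack from the best final label
    best = max(range(n_labels), key=lambda j: score[j])
    path = [best]
    for ptr in reversed(back):
        best = ptr[best]
        path.append(best)
    return list(reversed(path))

# Toy example with two labels (0 = O, 1 = B-entity) and neutral transitions:
# the middle token has the strongest score for label 1.
print(viterbi_decode([[2.0, 0.0], [0.0, 2.0], [2.0, 0.0]],
                     [[0.0, 0.0], [0.0, 0.0]]))  # → [0, 1, 0]
```

In a trained model the transition matrix is learned, which is what lets the CRF forbid invalid label sequences (e.g. an I-tag directly after O) rather than scoring each token independently.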
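The final classification step shared by both relation-extraction models (concatenate entity-label features with encoder features, score each relation class, apply softmax, take the argmax) can be sketched as follows. The feature vectors, weight matrix, and relation labels here are hypothetical placeholders standing in for the learned BiLSTM/LSTM outputs and fully connected layer described above.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_relation(entity_features, text_features, weights, labels):
    """Concatenate entity-label and encoder features, apply a linear
    layer (one weight row per relation class), and return the argmax label."""
    x = entity_features + text_features
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    probs = softmax(scores)
    return labels[max(range(len(labels)), key=lambda i: probs[i])]

# Hypothetical 2-dim entity features, 2-dim text features, two relation classes.
print(predict_relation([1.0, 0.0], [0.0, 1.0],
                       [[1.0, 0.0, 0.0, 0.0],   # scores class "colleague"
                        [0.0, 0.0, 0.0, 2.0]],  # scores class "friend"
                       ["colleague", "friend"]))  # → friend
```

In the joint model this classifier runs after entity labels are predicted, so errors and evidence from the entity module flow directly into the relation decision, which is the connection the joint design is meant to exploit.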
Keywords/Search Tags:deep learning, named entity recognition, relation extraction, BERT, LSTM