Research On Chinese Entity Recognition Based On BERT

Posted on: 2022-08-23
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Bao
GTID: 2518306521982019
Subject: Applied Statistics
Abstract/Summary:
Named entity recognition (NER) is a key technology in natural language processing that identifies entities in unstructured text. Its output feeds many downstream natural language processing tasks, so the quality of entity recognition directly affects how well those tasks perform. This thesis takes a deep learning approach: using the BLSTM-CRF model as the baseline, it introduces the BERT language model and adds a term to the loss function that counteracts imbalanced classification, yielding an improved BERT-BLSTM-CRF model. First, BERT, which has driven breakthroughs in natural language processing in recent years, is introduced into the entity recognition model as the word vector model. This addresses the inability of earlier word vector models to give a polysemous word different representations in different contexts, and it enriches the contextual semantics carried by the word vectors. Second, a CNN layer is added to the named entity recognition model to extract spatial features from the network input. Finally, Focal Loss, taken from image classification, is incorporated into the loss function to alleviate the imbalance among entity labels in sequence labeling. The models are tested and compared on the public "People's Daily" data set. In addition, a Chinese Wikipedia data set containing person, organization, and position entities is built to verify and compare the models and to prepare data for a follow-up information-monitoring knowledge graph. On both the People's Daily and the Chinese Wikipedia data sets, the improved algorithm raised the F1 score in every entity category.
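
To make the role of Focal Loss concrete, the following is a minimal sketch of the standard per-token formulation, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the probability the model assigns to the correct label. The abstract does not say how the thesis combines this term with the CRF loss, so the function names, the default `gamma` and `alpha` values, and the example probabilities below are illustrative assumptions, not the author's implementation.

```python
import math


def cross_entropy(probs, target):
    """Plain cross-entropy for one token: -log of the true label's probability."""
    return -math.log(probs[target])


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Standard focal loss for one token (illustrative defaults, not from the thesis).

    The factor (1 - p_t) ** gamma shrinks the loss of tokens that are already
    classified confidently, so frequent, easy labels contribute less to the
    total than rare, hard entity labels.
    """
    p_t = probs[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical label distributions for two tokens whose true label is index 0.
easy = [0.9, 0.05, 0.05]  # confidently correct prediction
hard = [0.2, 0.5, 0.3]    # true label receives low probability

for name, probs in (("easy", easy), ("hard", hard)):
    print(name, cross_entropy(probs, 0), focal_loss(probs, 0))
```

With `gamma=0` the function reduces to plain cross-entropy. Larger values of `gamma` down-weight well-classified tokens more strongly, which is the mechanism the abstract relies on to counter label imbalance.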
Keywords/Search Tags: named entity recognition, deep learning, BERT, BLSTM-CRF