
Named Entity Recognition Algorithm Based On BERT And Semantic Relevance

Posted on: 2022-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y X Yuan
Full Text: PDF
GTID: 2558307154476854
Subject: Electronic information
Abstract/Summary:
With the advent of the big-data era, information on the Internet has become increasingly abundant. Faced with ever-growing volumes of unstructured text, extracting vital information from massive text data has become a research hotspot, and a difficult one, both at home and abroad. As one of the basic tasks of information extraction, named entity recognition (NER) has attracted growing attention from researchers. In recent years, many neural network models have been proposed and widely applied to NER. However, several problems remain: traditional static word embeddings cannot handle polysemy; feature learning over long text sequences may dilute semantics; high-quality NER models tend to rely on large amounts of high-quality labeled data; and obtaining such large-scale labeled data is difficult. To address these problems, this thesis takes the mainstream BiLSTM-CRF as the baseline model and improves and optimizes it. The main contents are as follows.

First, this thesis proposes an entity recognition model that fuses BERT with a global attention mechanism, targeting the problems of polysemy in natural language and semantic dilution in long text. BERT, which consists of multi-layer Transformer encoding units, is introduced for word embedding: it learns the contextual semantic information of each word in a text sequence and dynamically generates word vectors, providing high-quality representations for the downstream task. A global attention mechanism is then used to strengthen the role of key information in the text, alleviating semantic dilution and enabling more effective entity recognition.

Second, this thesis introduces the concept of an "entity combiner", derived from the semantic relevance of text information. An entity combiner is strongly correlated with entities and can improve an entity's numerical representation, thereby improving model performance. On this basis, a sub-model for entity combiner recognition is proposed, which recognizes entity combiners in unlabeled text by connecting the text with structural information. Exploiting the strong correlation between entity combiners and entities, the model fusing BERT and global attention constructs a constraint mechanism between entity representations and sentence representations. This ensures that the entity combiner guides the learning process and enables accurate and efficient entity recognition in text data.

Extensive experiments were conducted on the open datasets CoNLL-2003 and NCBI Disease, and the proposed algorithm was compared with other advanced algorithms from different perspectives. The results demonstrate the feasibility and effectiveness of the proposed method.
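The abstract does not give the exact form of the global attention mechanism used to counter semantic dilution, so the following is only a minimal sketch of one common formulation: a learned query vector scores every token, the scores are softmax-normalized, and a weighted context vector summarizes the sentence. The names `H`, `w`, and all dimensions are illustrative assumptions, not the thesis's actual parameters.

```python
import numpy as np

def global_attention(H, w):
    """Score every token against a learned query vector w, softmax the
    scores into attention weights, and pool a weighted context vector."""
    scores = H @ w                           # (n,) relevance of each token
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    context = weights @ H                    # (d,) attention-weighted summary
    return context, weights

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 4))   # five token vectors of dimension 4
w = rng.normal(size=4)        # illustrative attention query vector
context, weights = global_attention(H, w)
```

In a long sentence, tokens with low relevance receive weights near zero, so key information dominates the pooled representation instead of being averaged away.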
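Since the BiLSTM-CRF baseline decodes the tag sequence with the Viterbi algorithm, a small decoding sketch may help; it assumes per-token emission scores and a tag-to-tag transition matrix, both with illustrative shapes, and is not the thesis's implementation.

```python
import numpy as np

def viterbi(emissions, transitions):
    """CRF decoding: find the highest-scoring tag path.
    emissions: (T, K) per-token tag scores from the encoder.
    transitions: (K, K) score of moving from tag i to tag j."""
    T, K = emissions.shape
    dp = emissions[0].copy()            # best score ending in each tag at step 0
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # scores[i, j] = best path ending in tag i, then moving to tag j
        scores = dp[:, None] + transitions + emissions[t]
        back[t] = scores.argmax(axis=0)
        dp = scores.max(axis=0)
    # Trace the best path backwards from the highest-scoring final tag.
    path = [int(dp.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

With a B/I/O tag set, the transition matrix can assign low scores to invalid moves (e.g. O directly to I), which is why CRF decoding outperforms picking each token's tag independently.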
Keywords/Search Tags:Named entity recognition, Semantic relevance, Entity combiner, BERT, Attention Mechanism