Research On Boundary-based Nested Named Entity Recognition Method

Posted on:2021-03-19

Degree:Master

Type:Thesis

Country:China

Candidate:L F Wu

Full Text:PDF

GTID:2438330623484369

Subject:Computer Science and Technology

Abstract/Summary:

Named entity recognition (NER) is a foundational research task in natural language processing. Traditional approaches use shallow sequence labeling models (such as Hidden Markov Models and Conditional Random Fields) to output one label for each word, indicating whether it is at the "begin", "inside" or "outside" of an entity (BIO tags). The sequence model usually outputs the label path with the highest probability, found by a dynamic programming algorithm. Because traditional NER mainly processes sentences independently, sparse context features are a prominent weakness. In addition, a single highest-probability label path cannot recognize nested named entities (nested NEs). To recognize nested NEs, most related work employs a cascading model: first generate candidate entities, then classify them. A cascading model can only optimize each stage separately, cannot reach a global optimum, and is prone to cascading errors.

To address sparse features and the difficulty of recognizing nested entities, this paper proposes a deep boundary assemble model (NNBA). Based on the boundary assemble algorithm, NNBA constructs a BERT-based cascade model. First, the model uses a sequence labeling algorithm based on Bi-LSTM-CRF to recognize the start and end boundaries of NEs. Then, candidate NEs are generated by boundary assemble. Finally, a Multi-LSTM model is used to discriminate among the candidate entities. Because BERT can use external data to automatically obtain semantic information and context dependence, NNBA can effectively overcome the sparse feature problem faced by shallow learning models. On the ACE2005 Chinese data set, NNBA reached an F1 value of 90.12%, exceeding the comparison method by 17%.

In view of the shortcomings of the cascading model, this paper further proposes an end-to-end boundary regression model (BR) based on NNBA. BR borrows ideas from object detection and, in keeping with the characteristics of linear text sequences and of NER, adopts a linear sampling algorithm and a boundary regression algorithm. A multi-objective learning model based on a neural network and the boundary regression algorithm is constructed in an end-to-end manner. While predicting the classification label of a text bounding box, the model also predicts the box's position, which makes more effective use of the supervision information in the labeled data. The BR model performed well on the ACE2005 Chinese data set, reaching an F1 value of 89.30%.
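
The boundary assemble step described for NNBA can be illustrated with a minimal Python sketch. The abstract does not give the exact pairing rule, so the function name `assemble_candidates`, the `max_len` cutoff, and the rule "pair every predicted start with every predicted end at or after it" are assumptions made for illustration only, not the thesis's implementation.

```python
from typing import List, Tuple


def assemble_candidates(
    starts: List[int], ends: List[int], max_len: int = 10
) -> List[Tuple[int, int]]:
    """Pair predicted start and end boundaries into candidate spans.

    Assumption (not from the thesis): every start is paired with every
    end at or after it, as long as the span has at most `max_len` tokens.
    Spans are returned as inclusive (start, end) token indices.
    """
    candidates = []
    for s in sorted(set(starts)):
        for e in sorted(set(ends)):
            if s <= e and (e - s + 1) <= max_len:
                candidates.append((s, e))
    return candidates


# Hypothetical example: "Washington" is nested inside
# "University of Washington", so two starts share one end boundary.
tokens = ["the", "University", "of", "Washington"]
spans = assemble_candidates(starts=[1, 3], ends=[3])
print(spans)  # [(1, 3), (3, 3)]
print([" ".join(tokens[s : e + 1]) for s, e in spans])
```

Because overlapping spans can coexist in the candidate list, nested entities remain available to the later classification stage, which a single BIO label path cannot express.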

Keywords/Search Tags:

Nested Named Entity Recognition, Boundary Assemble, Boundary Regression, Information Extraction, Deep Learning

Related items

1	Research On Nested Named Entity Recognition Based On Knowledge Embedding And Boundary Enhancement
2	Joint Extraction Of Nested Named Entity And Relations Based On Multi-task Learning
3	Research On Nested Named Entity Recognition Algorithm Based On Deep Learning
4	The Research Of Weibo Entity Recognition Model Based On Active Learning
5	Candidate Region Aware Nested Named Entity Recognition
6	Chinese Nested Named Entity Recognition Research
7	The Method Of Nested Named Entity Recognition In Microblog
8	Design And Implementation Of Information Extraction Based On Deep Learning
9	Research On Named Entity Recognition And Relation Extraction Between Entities Based On Depth Learning
10	Chinese Nested Named Entity Recognition And Relation Extraction