
Research On Animal Science Domain Named Entity Recognition Based On BERT Pre-training Model

Posted on: 2023-12-15
Degree: Master
Type: Thesis
Country: China
Candidate: F H Yang
Full Text: PDF
GTID: 2543306803462724
Subject: Computer Science and Technology
Abstract/Summary:
With the promotion of "new agricultural science" construction and the development of agricultural information technology, the animal science profession has developed rapidly, and many animal science workers now ask questions and acquire knowledge through the Internet. Named entity recognition is a core fundamental technology in natural language processing: it identifies entities in various kinds of unstructured question-and-answer data, extracts useful information, and thereby supports applications such as question-answering systems and knowledge graphs for workers in the animal science field. Although named entity recognition has been applied in many Chinese-language domains, traditional word embedding techniques produce a single static vector per word and therefore cannot capture the fact that many Chinese characters carry multiple meanings. In addition, progress on named entity recognition in animal science has been slow because the field is highly specialized and lacks the annotated data that entity recognition requires. In this paper, we create a corpus for the animal science domain and construct a new entity recognition model to apply to it. The main research contents are as follows.

(1) The base text of the corpus consists of Chinese literature related to the animal science domain collected from the Internet. After pre-processing and cleaning the base text, the corpus is annotated with the "BIO" (B-begin, I-inside, O-outside) scheme using a corpus annotation tool. In this way we build a corpus for the animal science domain.

(2) Building on the BERT pre-training model, the commonly used LSTM-CRF named entity recognition model is improved by introducing a bidirectional long short-term memory network, yielding a BERT-BiLSTM-CRF model. The model first uses BERT to obtain word vector representations that carry contextual semantic information, effectively addressing the problem of words with multiple meanings. The word vectors are then fed into the bidirectional LSTM layer for context encoding to improve recognition accuracy. Finally, the conditional random field layer produces the optimal tag sequence.

(3) The model is evaluated on the created animal science corpus and compared with the RNN-CRF, LSTM-CRF, BiLSTM-CRF, and BERT-CRF models. The results show that the proposed model outperforms the other models in precision, recall, and F1-score of entity recognition, demonstrating the effectiveness of the model.
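To make the BIO annotation scheme concrete, the following minimal sketch shows how character-level BIO tags (as is common for Chinese NER) are decoded back into entity spans. The sentence, the tags, and the ANIMAL/DISEASE entity types are hypothetical illustrations, not drawn from the thesis corpus.

```python
# Toy illustration of the "BIO" (B-begin, I-inside, O-outside) scheme.
# The example sentence and entity types below are hypothetical.

def extract_entities(tokens, tags):
    """Collect (entity_text, entity_type) spans from a BIO-tagged sequence."""
    entities, current, etype = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):            # a new entity begins here
            if current:
                entities.append(("".join(current), etype))
            current, etype = [tok], tag[2:]
        elif tag.startswith("I-") and current:
            current.append(tok)             # continue the current entity
        else:                               # "O": outside any entity
            if current:
                entities.append(("".join(current), etype))
            current, etype = [], None
    if current:
        entities.append(("".join(current), etype))
    return entities

# Character-level tagging of a toy sentence ("Dairy cows are prone to mastitis").
tokens = list("奶牛易患乳腺炎")
tags = ["B-ANIMAL", "I-ANIMAL", "O", "O", "B-DISEASE", "I-DISEASE", "I-DISEASE"]
print(extract_entities(tokens, tags))  # [('奶牛', 'ANIMAL'), ('乳腺炎', 'DISEASE')]
```

In an annotated corpus each character is stored with exactly one such tag, so entity boundaries and types can be recovered unambiguously from the tag sequence alone.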
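The role of the CRF layer in item (2) is to score whole tag sequences rather than individual tokens, so that the decoded path respects tag-transition constraints (e.g. an I- tag cannot follow O). A minimal Viterbi-decoding sketch illustrates this; the emission and transition scores below are toy stand-ins for values a trained BERT-BiLSTM-CRF model would produce, not the thesis's actual parameters.

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag path under a linear-chain CRF.

    emissions:   (seq_len, n_tags) per-token scores (from the BiLSTM layer)
    transitions: (n_tags, n_tags)  learned tag-to-tag scores
    """
    seq_len, n_tags = emissions.shape
    score = emissions[0].copy()                  # best score ending in each tag
    back = np.zeros((seq_len, n_tags), dtype=int)
    for t in range(1, seq_len):
        # candidate[i, j] = score ending in tag i, then transitioning to tag j
        cand = score[:, None] + transitions + emissions[t][None, :]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]                 # follow back-pointers
    for t in range(seq_len - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Three tags: 0 = O, 1 = B, 2 = I; the transition O -> I is heavily penalised.
trans = np.array([[0.0, 0.0, -10.0],
                  [0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
emis = np.array([[1.0, 0.5, 0.0],    # token 1: per-tag emission scores
                 [0.0, 0.4, 0.5]])   # token 2: greedy argmax would pick I
print(viterbi_decode(emis, trans))   # [0, 1] -- O then B, avoiding illegal O->I
```

Per-token greedy decoding would output the illegal sequence O, I here; the CRF's transition scores steer the global optimum to a valid path, which is why the CRF layer yields the "optimal recognition effect" described above.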
Keywords/Search Tags:Named Entity Recognition, Animal Science Domain, Bi-directional LSTM, BERT, Conditional Random Field