Research On Text Named Entity Recognition In Security Field Based On Transfer Learning And BERT

Posted on:2023-08-18

Degree:Master

Type:Thesis

Country:China

Candidate:W F Xiong


GTID:2531307058999409

Subject:Computer technology

Abstract/Summary:


As a fundamental task in natural language processing, named entity recognition is an active research topic in fields such as finance, medicine and biology, and it also plays an important role in information extraction, machine translation, semantic analysis and other work. This thesis addresses the entity recognition requirements of intelligent search engines in the security industry, and investigates how to identify security-field entities in the search content entered by users. The main work of this thesis is as follows:

1. To address the lack of annotated corpus resources in the security field, a complete and practical training corpus generation scheme is designed. Abstract sentence templates of user expressions are extracted and data sets of the entities to be recognized are collected; the training corpus is then generated by slot-filling, yielding an information retrieval corpus for the security field.

2. To address the wide variety of entities in the security field and the difficulty of finding generic features for complex user expressions, a high-performance named entity recognition model, TBBC (Tradaboost-BERT-BiLSTM-CRF), is designed. Building on BERT word vectors, the model uses BiLSTM and CRF for feature extraction and combines them with the Tradaboost update strategy to reduce the difference between the self-built corpus and the real user input dataset during training, achieving high recognition accuracy even without sufficient real data samples.

3. Hyperparameter control experiments, multi-model comparison experiments and ablation experiments were carried out. The hyperparameter control experiments find appropriate hyperparameters for the model; the multi-model comparison experiments compare the performance of different deep learning models on named entity recognition tasks in the security field; and the ablation experiments verify the effectiveness of the pre-training and fine-tuning mode, of the self-built corpus, and of the transfer learning update strategy. The experiments show that, compared with the ALBERT-BiLSTM-CRF model, which currently performs well in named entity recognition tasks, the TBBC model improves the accuracy, recall rate and F1 value by 9.8%, 9.7% and 9.8%, respectively.

4. The constructed TBBC model is applied to a practical engineering project: a named entity recognition service for security-field text is designed and implemented. The service quickly and accurately identifies security-field entities in the retrieval text entered by users, and standardizes the identified time and address entities.
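
The slot-filling corpus generation described in point 1 can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the templates, the entity lists, the label names (VEHICLE, ADDRESS, TIME, PERSON), the whitespace tokenization and the BIO tagging scheme are all assumptions chosen for illustration, and the function names `fill_template` and `generate_corpus` are hypothetical.

```python
import random
import re

# Hypothetical sentence templates; each {LABEL} is a slot for one entity type.
TEMPLATES = [
    "show me {VEHICLE} near {ADDRESS} on {TIME}",
    "find {PERSON} at {ADDRESS}",
]

# Hypothetical entity data sets, one list of surface forms per entity type.
ENTITIES = {
    "VEHICLE": ["white van", "red truck"],
    "ADDRESS": ["north gate", "parking lot B"],
    "TIME": ["last Friday", "yesterday evening"],
    "PERSON": ["the courier"],
}

SLOT = re.compile(r"\{([A-Z]+)\}")


def fill_template(template, entities, rng):
    """Fill every slot with a random entity and return (tokens, BIO tags)."""
    tokens, tags = [], []
    pos = 0
    for match in SLOT.finditer(template):
        # Literal template text before the slot is tagged as outside ("O").
        for word in template[pos:match.start()].split():
            tokens.append(word)
            tags.append("O")
        label = match.group(1)
        value = rng.choice(entities[label])
        # First token of the entity gets B-, the remaining tokens get I-.
        for i, word in enumerate(value.split()):
            tokens.append(word)
            tags.append(("B-" if i == 0 else "I-") + label)
        pos = match.end()
    # Literal text after the last slot.
    for word in template[pos:].split():
        tokens.append(word)
        tags.append("O")
    return tokens, tags


def generate_corpus(n, seed=0):
    """Generate n labeled sentences from randomly chosen templates."""
    rng = random.Random(seed)
    return [fill_template(rng.choice(TEMPLATES), ENTITIES, rng) for _ in range(n)]


if __name__ == "__main__":
    for tokens, tags in generate_corpus(3):
        print(list(zip(tokens, tags)))
```

Because the labels come from the slots themselves, every generated sentence is annotated without manual tagging. The trade-off is that generated sentences only cover the patterns captured by the templates, which is the gap between the self-built corpus and real user input that the abstract says the transfer learning update strategy is meant to reduce.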

Keywords/Search Tags:

Named Entity Recognition, BERT, Bidirectional Long Short-Term Memory Network, Conditional Random Field, Transfer Learning


Related items

1	A Method For Predicting Well Logs Using Bi-LSTM Based On Correlation Analysis
2	CNC System Field Technical Term Recognition Based On Deep Transfer Learning
3	Application Of Raman And Near-infrared Spectra Combined With Machine Learning In Pattern Recognition Of Drug
4	Research On Entity Extraction For Animal Food Safety Hazards
5	Design And Implementation Of Decision Support System For Iron And Steel Enterprises Based On Machine Learning
6	Research On Entity Recognition And Relation Extraction Method Of Civil Aviation Emergency
7	Research And Application For Named Entity Recognition Of Coal Mine Accident Field Based On Deep Learning
8	The Research On Soft Measurement Of Free Calcium Content In Cement Clinker Based On Long Short-term Memory Network And Attention Mechanism
9	Research Of Air Quality Prediction Based On Spatiotemporal Deep Neural Networks
10	Research Of Water Quality Prediction Method Based On Attention Mechanism And Long Short Term Memory Neural Network