Research And Implementation Of Named Entity Recognition Based On Deep Learning

Posted on: 2022-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: D H Yang
Full Text: PDF
GTID: 2518306338986579
Subject: Software engineering
Abstract/Summary:
Named entity recognition (NER), one of the key technologies in natural language processing, plays a fundamental role in tasks such as information extraction, knowledge-based question answering, and machine translation. Its main goal is to identify words or proper nouns with special meaning in unstructured text. Early NER methods were based on rules and dictionaries; they relied heavily on rules formulated by domain experts and had poor portability. Later methods were based on statistical learning, which constructs and extracts features through manual feature engineering. This process costs considerable time and labor, and the performance of the resulting features is unstable. With the continued development and application of deep learning in recent years, deep learning methods have gradually been applied to NER, the most representative being the BiLSTM-CRF model, which this thesis selects as the baseline for comparison experiments.

Compared with English, Chinese named entities have complex structures and many types, and characteristics of the Chinese language itself make the recognition task more difficult. To address the problems in Chinese NER, this thesis designs and improves deep learning models for NER and verifies their recognition performance on related datasets. On this basis, a named entity recognition system is designed and developed. The main research work is as follows.

First, this thesis proposes a character-level feature vector representation based on vocabulary enhancement. The method uses an external lexicon to construct different types of word sets for each character, compresses and vectorizes these sets, and integrates the resulting vocabulary information into the character vector. It avoids the error propagation that word segmentation introduces in word-level methods, and it integrates two-character features into the character vector representation. Experiments show that the method improves the precision, recall, and F1 score of entity recognition.

Second, for the label decoding layer of the three-layer architecture used by deep learning NER models, this thesis proposes an entity boundary detection and entity type discrimination algorithm. The algorithm is implemented with a multi-layer perceptron and softmax. After the sequence modeling layer has extracted features, they are fed into a multi-task framework composed of boundary detection and type discrimination for joint training. Experiments on the datasets show that the algorithm improves NER performance.

Finally, building on these two contributions, this thesis designs and implements a named entity recognition and display system. The system can display details of the datasets used in the experiments and the structure of the models, and it can plot a comparison of each model's recognition performance on the dataset selected by the user.
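The vocabulary-enhanced character representation described in the abstract can be illustrated with a small sketch. The abstract does not say which word-set types or which compression step the thesis uses, so everything here is an assumption made for illustration: the four set types B/M/E/S (character begins, is inside, ends, or alone forms a lexicon word), mean pooling as the compression step, the function names `build_word_sets`, `mean_pool`, and `enhance`, and the toy lexicon and vectors.

```python
def build_word_sets(sentence, lexicon, max_word_len=4):
    """For each character, collect the lexicon words that contain it.

    Assumed set types (not confirmed by the abstract):
    B = word begins here, M = character is inside the word,
    E = word ends here, S = the character alone is a lexicon word.
    """
    sets = [{"B": set(), "M": set(), "E": set(), "S": set()} for _ in sentence]
    n = len(sentence)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_word_len) + 1):
            word = sentence[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[start]["S"].add(word)
                continue
            sets[start]["B"].add(word)
            sets[end - 1]["E"].add(word)
            for mid in range(start + 1, end - 1):
                sets[mid]["M"].add(word)
    return sets


def mean_pool(words, word_vecs, dim):
    """Compress a word set into one vector by averaging (assumed strategy)."""
    vecs = [word_vecs[w] for w in words if w in word_vecs]
    if not vecs:
        return [0.0] * dim  # empty set contributes a zero vector
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def enhance(char_vec, char_sets, word_vecs, dim):
    """Concatenate the character vector with the four pooled set vectors."""
    out = list(char_vec)
    for tag in ("B", "M", "E", "S"):
        out.extend(mean_pool(char_sets[tag], word_vecs, dim))
    return out


# Toy data (hypothetical): "Nanjing Yangtze River Bridge".
sentence = "南京市长江大桥"
word_vecs = {
    "南京": [1.0, 0.0],      # Nanjing
    "南京市": [0.0, 1.0],    # Nanjing City
    "市长": [1.0, 1.0],      # mayor
    "长江": [2.0, 0.0],      # Yangtze River
    "长江大桥": [0.0, 2.0],  # Yangtze River Bridge
    "大桥": [2.0, 2.0],      # bridge
}
lexicon = set(word_vecs)

word_sets = build_word_sets(sentence, lexicon)
# Character 3 is "长": it begins two words and ends "市长".
enhanced = enhance([0.5, -0.5], word_sets[3], word_vecs, dim=2)
print(word_sets[3])
print(enhanced)
```

Because every character keeps all lexicon matches rather than a single segmentation, an ambiguous span such as 市长 versus 长江 does not force an early decision, which is the sense in which a character-level scheme sidesteps segmentation errors. In a real model the pooled vectors would come from pre-trained word embeddings and the result would feed the sequence modeling layer.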
Keywords/Search Tags:Named Entity Recognition, Bidirectional Long Short Term Memory Network, Pre-trained Language Model, Vocabulary Enhanced, Multi-task Learning