
Research On Chinese Named Entity Recognition Model Based On Deep Learning

Posted on: 2024-03-23    Degree: Master    Type: Thesis
Country: China    Candidate: R S Yang    Full Text: PDF
GTID: 2568307076973209    Subject: Computer technology
Abstract/Summary:
Named entity recognition (NER) is an important and fundamental task in natural language processing whose goal is to label and recognize named entities in text. It has important applications in machine translation, sentiment analysis, intelligent question answering, and intelligent search, and its accuracy directly affects the performance of these downstream tasks. The main difficulties of NER lie in word segmentation, ambiguity, nested entities, and the complex structure of sentences in the text corpus.

Deep-learning-based NER is usually implemented as a sequence labeling task. However, some feature extractors lack the capacity to handle the global and local information of text at the same time, which hinders the extraction of global and local features. Unlike English sentences, which have explicit space delimiters between words, Chinese sentences are written as unsegmented character sequences; although word-level feature representations alleviate some limitations of character-level representations of Chinese text, they cannot express the multiple senses and global semantic information carried by the characters themselves, and therefore cannot resolve the problem of one word having multiple meanings. In addition, deep-learning-based approaches require a large number of parameters, cannot effectively capture long-distance features, and their training cannot fully exploit the parallelism of GPUs, resulting in long training times that affect the execution of subsequent tasks. To address these problems, this paper proposes the following solutions:

(i) This paper proposes a BERT-BiLSTM-IDCNN-CRF Chinese named entity recognition model. The model uses a pre-trained BERT model as the input layer to address the polysemy of words in Chinese text. For feature extraction, the IDCNN model extracts local feature information, and a BiLSTM model is added to capture global feature information, compensating for the lack of long-range and global features; the outputs of the two models are fused to form the final feature sequence. Finally, the fused feature sequence is fed into a CRF layer to obtain the optimal label sequence for the sentence. In the experimental stage, comparisons against multiple groups of models show that the feature fusion method effectively improves Chinese named entity recognition.

(ii) Building on the feature fusion method, and in order to address computational efficiency, this paper proposes a BERT-Star-Transformer-GRU-CRF Chinese named entity recognition model. At the input layer, the method again uses the pre-trained BERT model to generate word vectors. In the feature extraction part, to reduce the high computational overhead and long training time of the standard Transformer, the lightweight Star-Transformer is adopted as the local feature extractor, and a computationally efficient GRU is used in parallel to extract global features. The global and local features are then fused and fed into the CRF layer to obtain the final optimal label sequence. In the experimental stage, comparisons across multiple groups of models show that this method also effectively improves Chinese named entity recognition.
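The following PyTorch sketch illustrates the fusion architecture of the first model as described above. It is a minimal sketch, not the thesis implementation: the hyperparameters (hidden sizes, dilation rates, number of tags) are illustrative assumptions, and the Hugging Face `transformers` BERT and the `pytorch-crf` CRF layer are used as stand-ins for the components named in the abstract.

```python
# Minimal sketch of a BERT-BiLSTM-IDCNN-CRF fusion model.
# Hidden sizes, dilation rates, and the tag set are illustrative assumptions.
# Requires the `transformers` and `pytorch-crf` packages.
import torch
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF


class BertBiLSTMIDCNNCRF(nn.Module):
    def __init__(self, num_tags, bert_name="bert-base-chinese",
                 lstm_hidden=128, cnn_channels=128, dilations=(1, 1, 2)):
        super().__init__()
        # BERT input layer: contextual character vectors help resolve polysemy.
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        # BiLSTM branch: captures long-range / global dependencies.
        self.bilstm = nn.LSTM(hidden, lstm_hidden, batch_first=True,
                              bidirectional=True)
        # IDCNN branch: stacked dilated 1-D convolutions for local features.
        self.idcnn = nn.ModuleList([
            nn.Conv1d(hidden if i == 0 else cnn_channels, cnn_channels,
                      kernel_size=3, dilation=d, padding=d)
            for i, d in enumerate(dilations)
        ])
        # Fuse the two feature streams by concatenation, then emit tag scores.
        self.emission = nn.Linear(2 * lstm_hidden + cnn_channels, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def _features(self, input_ids, attention_mask):
        x = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        global_feat, _ = self.bilstm(x)            # (B, T, 2 * lstm_hidden)
        local_feat = x.transpose(1, 2)             # (B, H, T) for Conv1d
        for conv in self.idcnn:
            local_feat = torch.relu(conv(local_feat))
        local_feat = local_feat.transpose(1, 2)    # (B, T, cnn_channels)
        fused = torch.cat([global_feat, local_feat], dim=-1)
        return self.emission(fused)                # (B, T, num_tags)

    def forward(self, input_ids, attention_mask, tags):
        emissions = self._features(input_ids, attention_mask)
        # Negative log-likelihood of the gold tag sequence under the CRF.
        return -self.crf(emissions, tags, mask=attention_mask.bool(),
                         reduction="mean")

    def decode(self, input_ids, attention_mask):
        emissions = self._features(input_ids, attention_mask)
        # Viterbi decoding returns the optimal tag sequence per sentence.
        return self.crf.decode(emissions, mask=attention_mask.bool())
```

The second model described in the abstract follows the same fusion pattern, with the IDCNN branch replaced by a lightweight Star-Transformer for local features and the BiLSTM replaced by a GRU for global features.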
Keywords/Search Tags: Natural Language Processing, Named Entity Recognition, BERT, Feature Fusion, Conditional Random Fields