Named Entity Recognition Of Chinese EMR Based On Deep Learning In The Context Of Stroke

Posted on: 2021-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: S M Wang
Full Text: PDF
GTID: 2494306104492014
Subject: Health information management
Abstract/Summary:
[Objective] This thesis analyzes the characteristics of Chinese stroke electronic medical records (EMRs), formulates a labeling standard for named entities, and constructs a named entity corpus of stroke EMRs. Named entity recognition on Chinese EMRs is then carried out with deep learning methods, including pre-trained models, and the best model is identified by comparing the recognition performance of the different models. The work lays a foundation for deeper mining of EMR knowledge and for knowledge graph construction.

[Methods] (1) Literature research: the development of named entity recognition was reviewed through a variety of channels, together with the theory of machine learning and deep learning, especially their application to named entity recognition. (2) Expert interviews: experts in natural language processing and front-line clinical staff were consulted to ensure the authority and accuracy of the labeling work and to support the research. (3) Combination of deep learning and machine learning: a bidirectional long short-term memory network (BiLSTM) and a conditional random field (CRF) were combined to recognize named entities, learning the context information of the EMR while improving the plausibility and accuracy of the output. (4) Pre-trained models: BERT and ERNIE were used to extract features from stroke EMRs, and the word vectors generated through fine-tuning were used as the input to the next neural network layer.

[Results] (1) Based on an analysis of the structural and linguistic characteristics of Chinese stroke EMRs, five entity categories were determined and a stroke EMR labeling standard was formulated. After two rounds of pre-labeling and formal labeling, the Chinese stroke EMR named entity corpus was constructed. It contains 69,222 entities, and the annotation consistency of the formal labeling was 94.56%. (2) A named entity recognition model based on BiLSTM-CRF was constructed. The labeled corpus was used to train the model, producing a sequence labeling model for named entity recognition on Chinese stroke EMRs. (3) With BiLSTM-CRF as the base model, the highest F1-scores in the word2vec feature-extraction experiments were 89.05% and 90.69%, while fine-tuning the pre-trained models BERT and ERNIE gave highest F1-scores of 93.51% and 94.18%, respectively.

[Conclusions] (1) Chinese stroke EMRs contain a large number of named entities, which can be effectively identified and extracted with deep learning methods. (2) Under the same conditions, word2vec feature extraction on character-level text gives better results than on word-level text. (3) Compared with traditional language models, pre-trained models such as BERT and ERNIE learn more language features and have stronger feature extraction ability. They remain effective in the highly specialized medical field, do not need to be retrained, and have good applicability. (4) Fine-tuning the ERNIE pre-trained model and feeding the extracted features into the BiLSTM-CRF model allows the textual features of EMRs to be learned more completely and achieves good named entity recognition performance. The model can support further applications such as information extraction, machine translation, and question answering systems.
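
The F1-scores reported above are typically computed at the entity level for sequence labeling tasks. The following is a minimal sketch of that calculation, assuming character-level BIO tags and exact-match scoring; the abstract does not state the tagging scheme or the evaluation script that was used, and the labels `DIS` and `SYM` and the function names `bio_to_spans` and `entity_f1` are hypothetical, chosen only for illustration.

```python
from typing import List, Set, Tuple

# (entity type, start index, end index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (one tag per character)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel flushes an entity that ends at the last position.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if continues:
            continue
        if etype is not None:
            spans.add((etype, start, i))
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
        else:
            # "O", or an I- tag with no matching open entity, is ignored.
            start, etype = None, None
    return spans


def entity_f1(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 over entity spans."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    true_pos = len(gold_spans & pred_spans)
    precision = true_pos / len(pred_spans) if pred_spans else 0.0
    recall = true_pos / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the second entity is predicted with the wrong boundary.
gold = ["B-DIS", "I-DIS", "I-DIS", "O", "B-SYM", "I-SYM"]
pred = ["B-DIS", "I-DIS", "I-DIS", "O", "B-SYM", "O"]
print(entity_f1(gold, pred))  # (0.5, 0.5, 0.5)
```

Under exact matching, a prediction with the right type but a wrong boundary counts as both a false positive and a false negative, which is why the partially correct second entity lowers precision and recall together.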
Keywords/Search Tags:Named Entity Recognition, Chinese EMR, Deep Learning, Pre-trained Model, Stroke