
Research On Chinese Electronic Medical Record Named Entity Recognition Based On BERT Embedding And Residual Connection

Posted on: 2020-08-31   Degree: Master   Type: Thesis
Country: China   Candidate: J Ding   Full Text: PDF
GTID: 2404330623459098   Subject: Engineering
Abstract/Summary:
With the continuing development of information technology and medical informatization, Electronic Medical Records (EMR) have gradually replaced paper records in many hospitals and become the core of hospital information systems. In modern hospital management, electronic medical records are not only efficient and convenient but also a first-hand source of information for scientific research and medical treatment. However, because electronic medical record data are large in scale and complex, the useful information in medical texts has not been fully mined. Named Entity Recognition (NER), a natural language processing technique, has therefore been introduced. NER is widely used in information extraction, intelligent question answering, syntactic analysis, and machine translation, and has attracted attention in many fields. Although NER has a long history, its recognition performance still falls short of requirements in some domains. In particular, traditional NER methods rely heavily on hand-crafted features, which demand considerable labor and time. This thesis therefore takes the widely used deep learning sequence labeling model BiLSTM-CRF as the baseline and improves on it so that it can be better applied to named entity recognition in Chinese electronic medical records. The research focuses on the following three aspects:

(1) Public Chinese electronic medical record data are severely lacking, and high-quality annotated Chinese EMR data are scarce in China, so conventional models do not achieve good recognition results. In addition, traditional word vectors map each word to a single vector and therefore cannot represent word ambiguity. To address these problems, a Chinese EMR named entity recognition model, BERT-BiLSTM-CRF, is proposed by combining the BERT (Bidirectional Encoder Representations from Transformers) pre-trained language model with the BiLSTM-CRF baseline. Features are added to the network. Comparative experiments show that the model effectively enhances the semantic representation of words and achieves better recognition results when only a small annotated corpus is available.

(2) Combining pre-training with iterated dilated convolution, a Chinese EMR named entity recognition model based on BBIC is proposed. For the vectors passed on by BERT, the model attends to global features while also taking local features into account. In dilated convolution, the dilation width grows exponentially with the number of layers while the number of parameters grows only linearly, so the receptive field expands exponentially and can quickly cover the entire input. Experiments show that, combined with pre-training on a small labeled corpus, the improved model captures text feature information more accurately and further improves recognition.

(3) Residual connections are introduced into the BBIC model to address the network degradation problem that arises when many layers are stacked to increase representational capacity. This further optimization of the whole model improves recognition on Chinese electronic medical records. Experimental results show that the proposed model achieves better named entity recognition on Chinese electronic medical records.
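The growth claim in point (2) can be checked with a little arithmetic. The sketch below is illustrative only: the kernel size of 3, the channel width of 8, and the doubling dilation schedule (1, 2, 4, ...) are assumptions chosen for the example, not the configuration used in the thesis, and the helper names `receptive_field` and `parameter_count` are hypothetical.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in tokens) of stacked 1-D dilated convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def parameter_count(kernel_size, channels, num_layers):
    """Weights plus biases for num_layers conv layers with equal in/out channels."""
    per_layer = kernel_size * channels * channels + channels
    return num_layers * per_layer


if __name__ == "__main__":
    # Assumed values for illustration; not taken from the thesis.
    kernel_size, channels = 3, 8
    for num_layers in range(1, 6):
        # Assumed schedule: dilation doubles at every layer.
        dilations = [2 ** i for i in range(num_layers)]
        print(
            num_layers,
            dilations,
            receptive_field(kernel_size, dilations),
            parameter_count(kernel_size, channels, num_layers),
        )
```

Under these assumptions, four layers with dilations 1, 2, 4, 8 cover 31 tokens, while the parameter count is simply four times that of a single layer: the field grows exponentially with depth and the parameters grow linearly.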
Keywords/Search Tags: Chinese electronic medical record, named entity recognition, BERT, iterated dilated convolution, residual connection