
Entity Extraction Technology Of Chinese Electronic Medical Record

Posted on: 2019-06-16
Degree: Master
Type: Thesis
Country: China
Candidate: J C Lin
Full Text: PDF
GTID: 2428330545482439
Subject: Computer technology
Abstract/Summary:
With the rapid development of society and the arrival of the information age, medical informatization has become part of everyday life. It has produced a large volume of medical data, and the electronic medical record, one of its products, contains a great deal of medical information. Mining the data in electronic medical records is therefore particularly important for advancing intelligent medical informatics. This thesis focuses on named entity extraction from Chinese electronic medical records. The specific work is as follows:

1. Referring to the medical entity definitions given by i2b2 and CCKS, the medical entities are classified and defined, and a named entity annotation corpus of Chinese electronic medical records on the scale of 2,000 records is established.

2. Two shallow machine learning methods, the structured support vector machine and the conditional random field (CRF), are used to study named entity extraction from Chinese electronic medical records, with case-structure features, dictionary features, and word-clustering features introduced. Because Chinese medical dictionaries and knowledge bases are lacking, an electronic medical record dictionary of a certain scale is built to support the research. Word vectors are constructed from 4,522 electronic medical records, and the Brown clustering method is used to cluster the words. After feature expansion, the structured support vector machine model gives the best experimental results.

3. Three deep neural network methods, the recurrent neural network (RNN), RNN-CRF, and CNN-BLSTM-CRF, are used to study named entity extraction from Chinese electronic medical records. The recurrent neural network model adopts a bidirectional long short-term memory (BLSTM) structure, which lets the hidden layer retain earlier historical information and so compensates for missing contextual information. The conditional random field module handles the sequence labeling problem well. An attention mechanism replaces the original concatenation of character vectors and word vectors with a weighted sum, so that the model can draw on word-vector and character-vector information dynamically. A convolutional neural network (CNN) module is introduced to extract local features of the text: the character vector matrix is convolved and pooled to obtain the character-level features of each word, and the word vector and character-level vector of each word are then concatenated. The bidirectional long short-term memory structure, the attention mechanism, and the conditional random field module are introduced into the recurrent neural network model step by step, and analysis of the experimental results shows that the model then reaches its best performance.

In summary, this thesis uses two shallow machine learning methods, the structured support vector machine and the conditional random field, to study entity extraction from Chinese electronic medical records; recognition performance improves when the extended features are added to the basic features. For the recurrent neural network method, introducing the convolutional neural network module to obtain character-level features that complement the word vectors, and introducing the bidirectional long short-term memory structure and the conditional random field module, are effective ways to improve recognition performance.
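The abstract describes replacing the concatenation of word and character vectors with an attention-weighted sum but does not give the formula. The sketch below shows one common way such a combination can be written, as an illustration only: a per-dimension gate z = sigmoid(W_word·w + W_char·c + b) followed by z·w + (1 − z)·c. The gate form, the function name `attention_combine`, and the parameters `w_word`, `w_char`, and `bias` are assumptions made for this example and are not taken from the thesis.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(matrix: Matrix, vector: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def sigmoid(x: float) -> float:
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def attention_combine(
    word_vec: Vector,
    char_vec: Vector,
    w_word: Matrix,  # hypothetical learned weights applied to the word vector
    w_char: Matrix,  # hypothetical learned weights applied to the character vector
    bias: Vector,    # hypothetical learned bias
) -> Vector:
    """Blend a word vector and a character-level vector of equal length.

    Assumed gate: z = sigmoid(w_word @ word_vec + w_char @ char_vec + bias).
    Output:       z * word_vec + (1 - z) * char_vec, element by element.
    """
    if len(word_vec) != len(char_vec):
        raise ValueError("word_vec and char_vec must have the same length")
    scores = [
        a + b + c
        for a, b, c in zip(matvec(w_word, word_vec), matvec(w_char, char_vec), bias)
    ]
    gate = [sigmoid(s) for s in scores]
    return [z * w + (1.0 - z) * c for z, w, c in zip(gate, word_vec, char_vec)]


if __name__ == "__main__":
    zero = [[0.0, 0.0], [0.0, 0.0]]
    # With all-zero parameters every gate value is 0.5, so the result is the
    # plain average of the two vectors.
    print(attention_combine([1.0, 0.0], [0.0, 1.0], zero, zero, [0.0, 0.0]))
```

Unlike concatenation, the weighted sum keeps the input dimension unchanged and lets the gate decide, per dimension, how much to rely on the word vector versus the character-level vector. In a trained model the gate parameters would be learned jointly with the rest of the network.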
Keywords/Search Tags: Electronic Medical Record, Entity, Recurrent Neural Network, Shallow Machine Learning