
Research On Entity Category Labeling Of Chinese Electronic Medical Records Based On Deep Learning

Posted on: 2022-08-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xu
Full Text: PDF
GTID: 2494306764495634
Subject: Automation Technology
Abstract/Summary:
Electronic medical records contain a large amount of medical knowledge, and research on processing their text is of great significance for clinical practice and scientific research. Extracting useful information from this kind of unstructured text has become key to progress in the medical field. As an important part of electronic medical record text processing, entity category labeling identifies the categories of the entities in the text, which is essential for subsequent work such as extracting relationships between entities and constructing medical knowledge graphs. The granularity with which the entities in medical records are classified directly affects the comprehensiveness and usefulness of the information extracted from them. However, there is no authoritative, unified definition of medical entity categories in China, and most studies classify medical entities too coarsely, so the information extracted from medical records is incomplete. In addition, the specialized vocabulary and complex structure of electronic medical records make entity category labeling more difficult.

In view of these difficulties, this thesis takes real Chinese electronic medical records as the research object and uses deep learning methods to study entity category labeling. The main work is as follows:

(1) A fine-grained definition of medical entity categories for Chinese electronic medical records is proposed, which divides the entity categories in the records in detail. A manually annotated corpus of 1200 real Chinese electronic medical record texts is constructed according to this definition, providing the data basis for subsequent experiments.

(2) A word-level CNN-BiLSTM-CRF entity category labeling model is designed. Based on the characteristics of electronic medical record text, the model combines the strengths of several neural network models to improve the accuracy of entity category labeling. First, a CNN extracts the character-level morphological features of the entities. Then a bidirectional LSTM learns word-level contextual features. Finally, a CRF analyzes and adjusts the output sentence-level tag sequence to obtain the optimal entity category labeling result.

(3) The pooling layer of the traditional CNN model is improved, and a hybrid pooling method based on max pooling and average pooling is proposed. According to the hyperparameter k set in the model, the k largest elements in each pooling region are selected, and their mean is used as the pooling result. This method fuses multiple strong activations in the feature map to obtain a better representation of the features extracted by convolution, and improves the overall entity category labeling accuracy of the model.

(4) Experiments verify that the joint model proposed in this thesis handles the entity category labeling problem in Chinese electronic medical records effectively and achieves higher labeling accuracy than the other models compared. In addition, the model is further encapsulated, and a user-friendly automatic entity category labeling system for Chinese electronic medical records is designed and implemented. Testing shows that the system simplifies user operations and makes the model accessible to non-specialists.
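
The hybrid pooling idea in point (3) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `top_k_mean_pool` and `hybrid_pool_1d` are hypothetical, and the sketch assumes a 1-D feature map split into non-overlapping windows, since the abstract does not specify the window shape or stride.

```python
from typing import List, Sequence


def top_k_mean_pool(window: Sequence[float], k: int) -> float:
    """Return the mean of the k largest values in one pooling window."""
    if not window:
        raise ValueError("pooling window must not be empty")
    # Clamp k to the valid range [1, len(window)] (an assumption of this sketch).
    k = max(1, min(k, len(window)))
    largest = sorted(window, reverse=True)[:k]
    return sum(largest) / k


def hybrid_pool_1d(features: Sequence[float], size: int, k: int) -> List[float]:
    """Apply top-k mean pooling over non-overlapping windows of a 1-D feature map."""
    if size < 1:
        raise ValueError("window size must be at least 1")
    # Assumed layout: stride equals the window size; a shorter trailing
    # window is pooled with k clamped to its length.
    return [
        top_k_mean_pool(features[i:i + size], k)
        for i in range(0, len(features), size)
    ]


features = [1.0, 5.0, 3.0, 2.0, 4.0, 0.0, 6.0, 2.0]
max_like = hybrid_pool_1d(features, size=4, k=1)   # behaves like max pooling
avg_like = hybrid_pool_1d(features, size=4, k=4)   # behaves like average pooling
hybrid = hybrid_pool_1d(features, size=4, k=2)     # mean of the 2 largest per window
```

Under these assumptions, k = 1 reduces to max pooling and k equal to the window size reduces to average pooling, so intermediate values of k interpolate between the two by averaging only the strongest activations.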
Keywords/Search Tags: Chinese electronic medical record, entity category labeling, convolutional neural network, long short-term memory network