
Study On Key Technology Of Chinese Electronic Medical Records Information Extraction

Posted on: 2018-08-17
Degree: Master
Type: Thesis
Country: China
Candidate: N Yu
Full Text: PDF
GTID: 2348330563952496
Subject: Control Science and Engineering
Abstract/Summary:
Digital healthcare and medical informatization have become important research topics in medicine, and research on electronic medical records has developed rapidly and been widely applied in recent years. Before structured electronic medical records were adopted, hospitals had accumulated a large number of unstructured electronic medical records. These records are valuable for research and clinical practice, but they do not support efficient information retrieval, so a large amount of medical information cannot be used effectively and resources are seriously wasted. Studying information mining for electronic medical records is therefore very important, and it is also one of the major tasks and challenges in modern medicine. Information extraction provides the technical support for mining and analyzing unstructured electronic medical records. The task is difficult because research in China started late, the natural language used in medical records is flexible, and the records contain many technical terms. Named entity recognition is the basic and most important step of information extraction, and it is a major part of this thesis. Building on named entity recognition, the thesis then studies entity relation extraction for medical records. Finally, it summarizes the research and discusses future work. The main contents of the thesis are as follows:

(1) Based on the characteristics of unstructured text in electronic medical records, a named entity recognition method based on multi-feature fusion conditional random fields is proposed. A total of 600 medical records from a tertiary (third-grade) hospital were used as experimental data; 400 records were randomly selected as the training set, and the remaining 200 were used as the test set. The features of the conditional random fields model were divided into basic and advanced features. By comparing different features and feature templates, the experimental parameters and the best feature combination were determined, and the final recognition of disease, symptom, and surgery entities in the electronic medical records achieved good results.

(2) Because no large, open, and comprehensive corpus of Chinese electronic medical records is available, a semi-supervised named entity recognition method is proposed that improves the bootstrapping algorithm by combining it with the advantages of the maximum entropy model. Starting from only a small number of seed words, the method identifies named entities in electronic medical records through continued learning and optimization of the maximum entropy model. The optimal experimental parameters were determined through several experiments. A comparison of the recognition results with those of other models shows that this method effectively improves named entity recognition on electronic medical records.

(3) On the basis of named entity recognition, the relations among diseases, symptoms, and surgeries were extracted. Because sentence structures and descriptive patterns are similar across Chinese electronic medical records, a method for entity relation extraction based on the convolution tree kernel is proposed. The medical records were preprocessed and converted into annotated syntax trees, and a multi-class SVM classifier was constructed with the "one-versus-one" approach. On this basis, experiments were carried out with the subtree kernel and with the subset tree kernel. The comparative results show that the subset tree kernel performs better than the subtree kernel on entity relation extraction for medical records.

This study of key technologies for information extraction from Chinese electronic medical records not only provides a good basis for data mining, statistics, and analysis of medical information, but also offers an effective method and approach for converting unstructured electronic medical records into structured ones.
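The difference between the subtree kernel and the subset tree kernel compared in item (3) can be illustrated with a small sketch. The abstract does not give the thesis's kernel formulation, parser, or decay factor, so everything here is an assumption: the code uses the commonly cited recursive definition of convolution tree kernels, the tuple tree encoding and the function names (`internal_nodes`, `production`, `delta`, `tree_kernel`) are hypothetical, and the two toy trees are English stand-ins for parsed medical record sentences.

```python
from itertools import product

# Assumed tree encoding (hypothetical): an internal node is a tuple
# (label, child, child, ...); a leaf is a plain string (a word).

def internal_nodes(tree):
    """Yield every internal node of the tree, root first."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from internal_nodes(child)

def production(node):
    """Grammar production at a node: its label plus each child's label,
    with a flag that keeps word leaves distinct from internal nodes."""
    children = tuple(
        (c[0], True) if isinstance(c, tuple) else (c, False)
        for c in node[1:]
    )
    return (node[0], children)

def delta(n1, n2, sigma, lam):
    """Weighted count of common fragments rooted at n1 and n2.
    sigma=0 counts only complete subtrees (subtree kernel);
    sigma=1 also allows children to be cut off (subset tree kernel)."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        # Leaf children are already matched by the production check.
        if isinstance(c1, tuple):
            score *= sigma + delta(c1, c2, sigma, lam)
    return score

def tree_kernel(t1, t2, sigma, lam=1.0):
    """Sum delta over all pairs of internal nodes of the two trees."""
    pairs = product(internal_nodes(t1), internal_nodes(t2))
    return sum(delta(a, b, sigma, lam) for a, b in pairs)

# Toy trees that differ only in the determiner.
the_dog = ("NP", ("D", "the"), ("N", "dog"))
a_dog = ("NP", ("D", "a"), ("N", "dog"))

subtree_score = tree_kernel(the_dog, a_dog, sigma=0)
subset_tree_score = tree_kernel(the_dog, a_dog, sigma=1)
```

In this sketch the subtree kernel only credits the shared `(N dog)` subtree, while the subset tree kernel also credits partial fragments rooted at `NP`, which is one plausible reason a subset tree kernel can be more tolerant of small wording differences between otherwise similar sentences. A kernel matrix computed this way could, for example, be passed to an SVM that accepts precomputed kernels; whether the thesis did so is not stated in the abstract.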
Keywords/Search Tags: electronic medical records, information extraction, conditional random fields, bootstrapping method, convolution tree kernel