
Research On Prediction And Filling Methods For Missing Data Of Electronic Health Records

Posted on: 2021-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: X C Chen
GTID: 2404330602964600
Subject: Engineering
Abstract/Summary:
Electronic Health Records (EHR) contain a patient’s personal health-related information, such as physical condition, disease information, immune status, and hospitalization records. Deep learning methods such as neural networks can mine medical patterns from large amounts of EHR data, and these patterns help to detect and treat diseases early. However, for objective reasons such as untimely data recording and limited measurement conditions, EHR data often contain a large number of missing items, which greatly limits the application of machine learning methods. An effective method for handling missing data items is therefore needed. By analyzing the characteristics of EHR data and existing missing-data prediction methods, this paper proposes two missing-data prediction methods based on the Recurrent Neural Network (RNN). Both methods are used to process the missing data of the MIMIC-III dataset. In-hospital patient mortality is then predicted from the filled dataset to verify the effectiveness of the missing-data processing. The main work of this paper is as follows:

(1) This paper proposes a method for predicting and filling missing EHR data based on the Long Short-Term Memory network (LSTM). First, we extract the patient’s physiological data fields from the EHR data and mark the missing items in those fields. We also mark the corresponding visit records of patients with missing items, forming a new dataset. Second, we train the LSTM model on the new dataset, use the trained model to predict the missing items, and fill each marked position with its predicted value to form a complete dataset. Finally, we verify the predicted data using the mean absolute error (MAE) and in-hospital mortality prediction. Experimental results show that the MAE of the predicted data is reduced to below 0.44 for all fields, which verifies the accuracy of the predicted values. In the in-hospital mortality prediction task, the prediction accuracy reaches 94.3%, which verifies the validity of the predicted values.

(2) This paper proposes a prediction and filling method for missing EHR data that combines attention with a Bidirectional Long Short-Term Memory network (Attention-BiLSTM). Analysis of EHR data shows that, on the one hand, a patient’s physiological data is ordered in time; on the other hand, individual values of some physiological measurements directly reflect the severity of the patient’s disease. In view of these characteristics, this paper improves the structure of the existing Bidirectional LSTM (BiLSTM) by introducing an attention mechanism in its hidden layer, yielding a bidirectional long short-term memory network combined with attention. We use this Attention-BiLSTM to predict the missing data and then fill in the missing items of the dataset. Experiments on the MIMIC-III dataset show that this method reduces the MAE of the predicted data to below 0.33 for all fields. The accuracy of in-hospital mortality prediction also improves further, reaching 95.1%. This shows that the method handles missing EHR data better and makes the prediction results more accurate.
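The mark, fill, and score steps described above can be illustrated with a minimal sketch. It is not the thesis implementation: the predictor here is a simple carry-forward stand-in for the trained LSTM or Attention-BiLSTM, the readings are hypothetical, and scoring against values hidden on purpose is an assumption, since the abstract does not state how the MAE evaluation positions were chosen.

```python
from typing import List, Optional

# One physiological field over time; None marks a missing item.
Series = List[Optional[float]]


def mark_missing(series: Series) -> List[int]:
    """Return a 0/1 mask: 1 where a value was recorded, 0 where it is missing."""
    return [0 if v is None else 1 for v in series]


def fill_forward(series: Series, default: float = 0.0) -> List[float]:
    """Stand-in predictor (assumption): carry the last observed value forward.

    A trained sequence model would replace this function.
    """
    filled: List[float] = []
    last = default
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def mean_absolute_error(truth: List[float], pred: List[float],
                        positions: List[int]) -> float:
    """MAE restricted to the positions that were hidden from the predictor."""
    if not positions:
        raise ValueError("no held-out positions to score")
    return sum(abs(truth[i] - pred[i]) for i in positions) / len(positions)


# Hypothetical normalized readings for one hospital stay.
truth = [0.2, 0.4, 0.5, 0.3, 0.6]
held_out = [2, 4]  # hidden on purpose so the fill can be scored
observed: Series = [None if i in held_out else v for i, v in enumerate(truth)]

mask = mark_missing(observed)    # marks which positions need filling
filled = fill_forward(observed)  # complete series with predicted values
mae = mean_absolute_error(truth, filled, held_out)
```

Restricting the error to the masked positions keeps the score from being diluted by entries the predictor was simply given.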
Keywords/Search Tags: EHR, Missing data processing, LSTM, Attention