
Research Of ICU Patients Mortality Prediction Based On Feature Extraction

Posted on: 2020-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Xu
Full Text: PDF
GTID: 2404330575452474
Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid growth in the supply of medical data, prognostic systems have become increasingly complex and accurate, especially in intensive care units (ICUs). The ICU is where critically ill patients are concentrated. Typical clinical ICU prognosis systems currently rely on physiological and demographic methods. These systems are mainly used for risk adjustment and pay little attention to predicting the disease progression of individual patients. Over the past decade, however, interest in medical data has grown significantly, and there is increasing demand for predicting the condition of a particular patient from large data sets. With the rise of big data analysis and machine learning, combining medical knowledge with machine learning has become a new approach to mortality prediction for ICU patients. Currently, most predictive models built with machine learning focus on training and improving the algorithms but ignore the analysis and processing of the data. Because of the variety and complexity of monitoring equipment, ICU data usually suffer from high dimensionality, irregular sampling times and frequencies, class imbalance, and missing values. These problems degrade the performance of predictive models, so preprocessing and feature extraction of the raw data are indispensable. This research focuses on the analysis, feature extraction and screening of ICU data. Three machine learning methods (decision tree, random forest and XGBoost) are used to build the prediction models, and the effects of various feature combinations on the models built with each algorithm are explored. The main research content of this thesis is divided into the following parts.

Firstly, an overall analysis of the ICU dataset used in this study is performed. The thesis statistically analyzes the class distribution of the sample set, calculates the average sampling count, missing rate, and distribution characteristics of each feature, and analyzes the influence of these statistical characteristics on the classification results. Appropriate machine learning methods and model evaluation metrics are selected based on the sample characteristics.

Secondly, non-time-series features and time-series features are extracted separately. For time-series features such as heart rate, blood pressure and respiratory rate, this thesis proposes a non-uniform interpolation method that preserves the original points and three methods for extracting features (diurnal division, difference and feedback coefficient). The features are ranked and filtered according to the AUC score of each single feature.

Then, to address the imbalanced sample set, this thesis uses XGBoost to build the prediction model and designs an experiment to compare five resampling methods. The resampling method that best fits the data set of this study is selected and applied in the subsequent studies.

Finally, experiments are designed to compare five feature sets and three machine learning methods. The experiments found that the prediction results are best when combining all the features obtained by the basic feature extraction method, those time-series features with abs(AUC-0.5)>0.05, the multiple random undersampling method, and the XGBoost algorithm. With this configuration, the AUC score on set-B was 0.856 and the S1 score was 0.509; the AUC score on set-C was 0.853 and the S1 score was 0.516.
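The screening rule abs(AUC-0.5)>0.05 mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the single-feature AUC was computed, so this is an assumption for illustration, not the thesis implementation: each feature is used directly as a score, the AUC is computed in its rank-based (Mann-Whitney) form, and features are kept when they clear the margin in either direction. The function names (`auc_score`, `screen_features`), the feature names, and the toy values are all hypothetical, and the sketch assumes missing values have already been handled.

```python
from typing import Dict, List, Sequence, Tuple


def auc_score(values: Sequence[float], labels: Sequence[int]) -> float:
    """AUC of a single feature used directly as a score (Mann-Whitney form).

    Counts the fraction of (positive, negative) pairs in which the positive
    sample has the larger value; ties count as half a win.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def screen_features(
    features: Dict[str, Sequence[float]],
    labels: Sequence[int],
    margin: float = 0.05,
) -> List[Tuple[str, float]]:
    """Keep features with abs(AUC - 0.5) > margin, most discriminative first.

    An AUC well below 0.5 is as informative as one well above it (the feature
    is inversely related to the outcome), hence the absolute value.
    """
    scored = [(name, auc_score(col, labels)) for name, col in features.items()]
    kept = [(name, auc) for name, auc in scored if abs(auc - 0.5) > margin]
    return sorted(kept, key=lambda item: abs(item[1] - 0.5), reverse=True)


# Toy data (hypothetical): 1 = died in hospital, 0 = survived.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "hr_mean": [70, 75, 80, 95, 100, 90],  # higher in the positive class
    "gcs_min": [15, 14, 13, 6, 8, 7],      # lower in the positive class
    "noise": [1, 2, 3, 1, 2, 3],           # same distribution in both classes
}
kept = screen_features(features, labels)
```

On this toy data the two informative features pass the filter and `noise` is dropped because its AUC is exactly 0.5. The pairwise loop is quadratic in the number of samples; on a real data set a library routine such as scikit-learn's `roc_auc_score` would be the usual choice.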
Keywords/Search Tags:Intensive Care Unit, Mortality Prediction, Decision Tree, Random Forest, eXtreme Gradient Boosting