Predicting Inpatient Length of Stay in Western New York Health Service Area Using Machine Learning Algorithm

Posted on:2018-07-07

Degree:M.S

Type:Thesis

University:State University of New York at Binghamton

Candidate:Salah, Haya

GTID:2474390020457474

Subject:Industrial Engineering

Abstract/Summary:

The main purpose of this thesis is to compare and analyze different classification models for predicting the length of stay (LOS) of inpatients in the Western NY health service area. Twelve classification models were used, including individual classifiers, ensemble methods, and deep learning. Based on the literature review conducted in this thesis, this is the first attempt to examine the performance of deep learning methods in LOS prediction. The data used for this research were obtained from the health.data.ny.gov website. The data contain basic record-level details on inpatient discharges in the State of New York across different health service areas in 2012, including age, gender, race, health service area, facility ID, diagnosis, patient disposition, length of stay, payment methods, etc. Only the records of inpatients in the Western NY health service area were considered. The methodology consists of three major parts: data preprocessing, training the prediction models, and evaluating the performance of the classification models. In data preprocessing, four steps were performed: (1) treating the missing values, (2) binning the target variable (LOS) into three classes (low, medium, and high) so that classification models could be applied to the data, (3) conducting a correlation test to eliminate redundant features, and (4) performing feature selection to identify the most significant features related to LOS using two filter techniques, namely Chi-square (chi2) and Mutual Information (MI).

The last step in data preprocessing was the transformation of categorical variables into dummy variables, which was performed in two steps. First, SPSS was used to transform the categorical values into ordinal numbers based on the levels of each variable; second, the "OneHotEncoder" in Python was used to convert the ordinal numbers into dummy variables.
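The preprocessing steps described above (binning LOS, scoring features with chi2 and MI, and one-hot encoding) could look roughly like the following sketch. It is not the thesis's code: the toy data, the column names, the LOS cut points (3 and 7 days), and the choice of k=2 features are all assumptions made for illustration, and scikit-learn's OrdinalEncoder stands in for the ordinal coding that the thesis performed in SPSS.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Toy stand-in for the discharge records; columns and values are hypothetical.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "age_group": rng.choice(["0-17", "18-29", "30-49", "50-69", "70+"], size=n),
    "gender": rng.choice(["F", "M"], size=n),
    "disposition": rng.choice(["Home", "SNF", "Expired"], size=n),
    "length_of_stay": rng.integers(1, 30, size=n),
})

# Bin LOS into three classes. The cut points are assumed, not taken from the thesis.
df["los_class"] = pd.cut(
    df["length_of_stay"],
    bins=[0, 3, 7, np.inf],
    labels=["low", "medium", "high"],
)
y = df["los_class"].cat.codes  # low=0, medium=1, high=2

# Ordinal codes for the categorical features (stand-in for the SPSS step).
features = ["age_group", "gender", "disposition"]
X_ord = OrdinalEncoder().fit_transform(df[features])

# Filter-based feature scores; chi2 requires non-negative inputs,
# which ordinal codes satisfy.
chi2_scores, _ = chi2(X_ord, y)
mi_scores = mutual_info_classif(X_ord, y, discrete_features=True, random_state=0)

# Keep the k best features by chi2 score (k=2 is arbitrary here).
selector = SelectKBest(chi2, k=2).fit(X_ord, y)
mask = selector.get_support()
selected = [name for name, keep in zip(features, mask) if keep]

# Convert the ordinal codes of the selected features into dummy variables.
X_dummy = OneHotEncoder().fit_transform(X_ord[:, mask]).toarray()

print("chi2 scores:", dict(zip(features, chi2_scores.round(3))))
print("MI scores:  ", dict(zip(features, mi_scores.round(3))))
print("selected:", selected, "dummy matrix shape:", X_dummy.shape)
```

Because each selected feature contributes exactly one active dummy column per record, every row of `X_dummy` sums to the number of selected features, which gives a quick sanity check on the encoding.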
After data preprocessing, the data was divided into two sets: a training set (70%) and a testing set (30%). The models were trained on the training set and tested on the testing set.

Based on the experimental results, using the features selected by the chi2 test results in higher training performance than using the features selected by MI. The performance of the models was compared based on the confusion matrix, accuracy, precision, recall, and F1-score. The deep learning method achieved the highest prediction accuracy, precision, recall, and F1-score on the testing set, at 88.5%, 88%, 89%, and 88% respectively, compared with the other classification models. Based on these results, it can be concluded that deep learning models have good potential for predicting patients' LOS.
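A minimal sketch of the 70/30 split and the evaluation metrics named above is given below. Several things here are assumptions rather than details from the thesis: the data is synthetic (generated with make_classification), scikit-learn's MLPClassifier is used as a small stand-in for the deep learning model, the split is stratified, and precision, recall, and F1 are weighted averages across the three classes. The numbers it prints are unrelated to the results reported in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic three-class data standing in for the encoded discharge records.
X, y = make_classification(
    n_samples=600,
    n_features=12,
    n_informative=6,
    n_classes=3,
    random_state=0,
)

# 70% training / 30% testing; stratification is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Small feed-forward network as a stand-in for the thesis's deep learning model.
model = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# The evaluation measures listed in the abstract.
cm = confusion_matrix(y_test, y_pred)
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="weighted", zero_division=0
)

print("confusion matrix:")
print(cm)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Rows of the confusion matrix correspond to true classes and columns to predicted classes, so the diagonal divided by the total count reproduces the accuracy.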

Keywords/Search Tags:

Health service area, Models, LOS, Deep learning, Length, Data, Thesis, Western

Related items

1	A Study Of Deep Learning Based Internet Health Data Analysis Technology
2	Application Of Deep Learning In Human Health Monitoring System Based On Body Surface Temperature
3	Deep Learning Model Management For Coronary Heart Disease Early Warning Research
4	Study On Functions Of Health Care In The Community Health Service Canter Of The Key Contact Cities In Western China
5	Application Of Deep Learning In Electronic Health Records Data
6	The Study On Multimodal Neurobiological Data Analysis And Brain Disease Recognition Based On Deep Learning
7	Research On Big Data Deep Learning For Health Diagnosis And Treatment
8	Investigation On Health Service In Rural Pastoral Area Of Inner Mongolia
9	A Deep Learning Method For Automatically Delineating The Target Area Of Radiotherapy For Brain Tumors
10	Research On AD Diagnosis Model Based On Deep Learning Of Multi-modal Data