| This research applies classification via clustering to classify the length of stay (LOS) of heart failure (HF) patients relative to the Geometric Mean Length of Stay (GMLOS). The classification via clustering approach adds the cluster label produced by a clustering algorithm to the original data, with the aim of improving the performance of the individual classification models. The data used in this research were retrieved from SPARCS (Statewide Planning and Research Cooperative System), the New York State Department of Health repository of de-identified inpatient and outpatient information. The research is carried out in the following steps: (1) data collection, (2) data pre-processing, (3) splitting the data into training and testing sets, (4) applying the k-Means clustering algorithm to the training set, (5) training classification models with and without the cluster label, (6) predicting the class of the testing set, and (7) comparing the performance of the classification models with and without cluster labels. The outcome variable LOS was coded as a binary variable by comparing each patient's actual LOS with the DRG-based GMLOS value: LOS is coded "High" if it is greater than or equal to GMLOS and "Low" otherwise. The classification models used in this research are decision tree (DT), naive Bayes (NB), k-nearest neighbors (KNN), support vector machine (SVM), and logistic regression (LR). The classification accuracies of DT, KNN, and LR were 0.902, 0.832, and 0.863, respectively; with the proposed approach, accuracy improved by 0.006, 0.005, and 0.006, respectively. In addition, the training CPU time over 1000 runs was reduced significantly, by 53, 34, 83, and 50 minutes for DT, KNN, SVM, and LR, respectively. The outcome of this research can help distinguish patients with high LOS from those with low LOS with high accuracy (> 85%), which could be used to identify potential reimbursements. 
Comparing LOS against the GMLOS metric to classify patients can give a rough idea of whether each patient will be reimbursed completely by Medicare, since Medicare uses the same metric, along with other criteria (Case Mix Index, DRG weights), as requirements for providing reimbursements to patients. |
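
The LOS coding rule and the cluster-label step described above can be illustrated with a small sketch. This is not the authors' implementation: the k-means routine is a minimal hand-written version using only the Python standard library, and the feature columns, toy values, and the GMLOS value of 3.9 days are hypothetical. It shows one plausible way to fit clusters on the training set only, append the cluster index as an extra feature to both the training and testing rows, and code LOS as "High" or "Low".

```python
import random

def code_los(los_days, gmlos_days):
    """Code LOS as "High" if it is >= the DRG-based GMLOS, else "Low"."""
    return "High" if los_days >= gmlos_days else "Low"

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def kmeans(points, k, iters=50, seed=0):
    """Minimal k-means: returns k centroids fitted on `points`."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group;
        # keep the old centroid if a group is empty.
        new = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new == centroids:
            break
        centroids = new
    return centroids

def add_cluster_label(rows, centroids):
    """Append the index of the nearest centroid as an extra feature."""
    return [list(r) + [nearest(r, centroids)] for r in rows]

# Hypothetical toy features: [age, total charges in thousands]
train = [[45, 10], [50, 12], [47, 11], [80, 40], [82, 42], [78, 39]]
test = [[48, 11], [81, 41]]

# Clusters are fitted on the training set only, then applied to both sets.
centroids = kmeans(train, k=2)
train_aug = add_cluster_label(train, centroids)
test_aug = add_cluster_label(test, centroids)

# Hypothetical (actual LOS, GMLOS) pairs in days.
labels = [code_los(los, gmlos) for los, gmlos in [(6, 3.9), (2, 3.9), (3.9, 3.9)]]
```

Any classifier can then be trained once on `train` and once on `train_aug`, and the two compared on the corresponding test sets. In practice the features would normally be scaled before clustering, since k-means is sensitive to differences in feature magnitude.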