Building and Evaluating Predictive Model Classifiers for Percutaneous Coronary Intervention (PCI) Procedure Composite Outcome using Pre-procedure Patient Data

Posted on: 2015-07-24
Degree: M.S.
Type: Thesis
University: University of California, Davis
Candidate: Patel, Tanuj G
GTID: 2474390017995584
Subject: Medicine
Abstract/Summary:
Background: With the growing availability and accessibility of data, combined with technological advances in healthcare, healthcare organizations and providers are recognizing the value of predictive data mining techniques for making wide-ranging decisions based on historical data. This research assesses the process of predictive modelling using real-world PCI procedure data from six pilot hospitals participating in the Percutaneous Coronary Intervention California Audit Monitored Pilot with Offsite Surgery (PCI-CAMPOS) program. The proposed model uses only pre-procedure patient data to find similarities between patients and their clinical characteristics, in order to train on and in turn predict the risk of a post-PCI composite outcome, defined as post-procedure stroke and/or post-procedure emergency CABG surgery and/or in-hospital death.

Methods: The predictive modelling process followed several important, methodical steps. First, the data were prepared and pre-processed to derive a class attribute and other supplementary attributes from the existing data. After selecting only pre-procedure attributes, different data pre-processing and data-level class balancing techniques were applied to the original imbalanced data set, creating a total of 18 data sets. Each of the 18 data sets was divided into training and testing sets, and variations of the k-nearest neighbor (k-NN) algorithm, under the instance-based learning (IBL) methods in WEKA, were used for training. An algorithmic approach to balancing the class attribute was also tried, using cost-sensitive classifiers. The process resulted in 126 trained models, which were subsequently evaluated on the corresponding testing sets using the same k-NN algorithms. All training and testing results were assessed for sensitivity, specificity, area under the receiver operating characteristic curve (AUC), overall accuracy, positive predictive value, and negative predictive value.

Results: On the training data, 55 of the 126 models achieved ≥ 80% sensitivity and 113 achieved ≥ 80% specificity. All 55 models with ≥ 80% sensitivity had at least one data pre-processing or class balancing technique applied. However, when the 126 models were evaluated on the testing data, only 8 achieved 80% sensitivity, and none of those 8 reached a specificity greater than 70%. The AUC of these 8 potential models ranged from 0.667 to 0.844, and their overall accuracy ranged from 42.7% to 71.2%. Although all potential models had a negative predictive value of ≥ 99%, none attained a positive predictive value (precision) of ≥ 10%. Of the 8 potential models, all had under-sampling applied, five used feature selection, and five used discretization. Three potential models achieved the sensitivity goal by training with the cost-sensitive 3NN algorithm; two used the IB1 algorithm; one used the cost-sensitive 5NN algorithm; and one used the average probability of IB1, 3NN, and 5NN in a single ensemble algorithm.

Conclusion: The experiments indicate that a historical data set can be used to construct models that make effective predictions by leveraging similarities between patients and their clinical characteristics. However, to achieve both 80% sensitivity and 80% specificity for composite outcome prediction using only pre-PCI procedure data, an additional pool of data may be required for more training, exhaustive testing, and validation. Moreover, experimentation with complementary data mining techniques and other machine learning algorithms could provide additional benefits.
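The six evaluation measures named in the abstract all derive from a 2x2 confusion matrix, except AUC, which also needs ranked scores. The Python sketch below is a minimal illustration of that arithmetic and is not the thesis's WEKA workflow. The names `ConfusionCounts` and `classification_metrics` are hypothetical, and the example counts are invented for illustration: they assume a 1,000-patient test set with a 2% event rate, a figure the abstract does not report.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary confusion matrix (composite outcome = positive)."""
    tp: int  # outcome occurred, model predicted it
    fp: int  # outcome did not occur, model predicted it
    tn: int  # outcome did not occur, model predicted none
    fn: int  # outcome occurred, model predicted none


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the confusion-matrix measures listed in the abstract (AUC excluded)."""
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),   # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),   # true negative rate
        "ppv": _ratio(c.tp, c.tp + c.fp),           # positive predictive value
        "npv": _ratio(c.tn, c.tn + c.fn),           # negative predictive value
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
    }


# Hypothetical counts (NOT from the thesis): 20 events among 1,000 patients,
# with 80% sensitivity (16 of 20) and 70% specificity (686 of 980).
example = ConfusionCounts(tp=16, fp=294, tn=686, fn=4)
metrics = classification_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these assumed counts, the positive predictive value is 16/310, about 5%, while the negative predictive value is 686/690, about 99.4%. A rare outcome can produce the pattern reported in the results this way: even at 80% sensitivity, false positives drawn from the much larger event-free group outnumber the true positives.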
Keywords/Search Tags: Data, Predictive, PCI, Composite outcome, Using, Model, Class, Testing