Application of Engineering Principles with a Comparison of Machine Learning Classification Methods to Predict Treatment Outcomes in Head and Neck Cancer Patients

Posted on: 2017-03-02
Degree: D.Engr
Type: Dissertation
University: The George Washington University
Candidate: Hernandez, Alberto Miranda
Full Text: PDF
GTID: 1468390014467501
Subject: Biomedical engineering
Abstract/Summary:
Our research approach compared several single classification methods (Decision Trees, Logistic Regression, Naive Bayes, Linear Discriminant Analysis, Nearest Neighbors, and Support Vector Machines) with ensemble classifier models (bagging and boosting). The models predicted two outcomes: weight loss of five or more kilograms, and toxicity of five or more grays above the actual radiation therapy dose received by patients, with data up to 90 days post-treatment. The data for this study were obtained from Johns Hopkins Hospital, Baltimore, MD, as anonymous data sets from the Oncospace® database. The weight loss data consisted of randomly selected records of 326 patient instances (rows) and 295 features or predictor variables (columns), out of 729 features available. Features included tumor factors, diagnosis, treatment, patients' anonymous biographical data, cancer site, and quality of life surveys (Appendix A), among others. Toxicity data included 597 patient instances (rows) and 37 predictor variables (columns), including toxicity to various organs and tissue. The Oncospace® data came from previously treated patients and covered January 1, 2006, through June 24, 2014 (sample data fields in Appendix B). Feature variables and models were validated by evaluating predictive accuracy with 10-fold cross-validation and expert feature selection (domain knowledge and tools). We built the models using the training and testing process available in the Classification Learner application of the MathWorks® MATLAB® Statistics and Machine Learning Toolbox™. Ensemble bagged trees classifiers showed prediction accuracies of 86.1% (toxicity) and 96.3% (weight loss). Ensemble boosted trees showed 92.3% (toxicity) and 100.0% (weight loss). Ensemble methods showed consistently higher prediction accuracies than single classifiers.
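The evaluation procedure described in the abstract, 10-fold cross-validation of a single classifier against a bagged ensemble, can be sketched in a few lines. This is an illustrative sketch only, not the dissertation's implementation: it uses plain Python rather than the MATLAB Classification Learner application, substitutes one-split decision stumps for full decision trees, omits boosting, and runs on synthetic data generated by a hypothetical helper (`make_toy_data`), not on Oncospace records. All function names are assumptions introduced for the example.

```python
import random
from collections import Counter


def make_toy_data(n=100, seed=1, noise=0.1):
    """Hypothetical synthetic data: the label follows feature 0, with some flipped labels."""
    rng = random.Random(seed)
    X = [[rng.random(), rng.random()] for _ in range(n)]
    y = []
    for row in X:
        label = 1 if row[0] > 0.5 else 0
        if rng.random() < noise:  # flip a fraction of labels to mimic noisy outcomes
            label = 1 - label
        y.append(label)
    return X, y


def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k nearly equal, disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_stump(X, y):
    """Fit a one-split stump: the (feature, threshold, polarity) with the fewest training errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                errors = sum(
                    (left if row[f] <= t else right) != label
                    for row, label in zip(X, y)
                )
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    _, f, t, left, right = best
    return lambda row: left if row[f] <= t else right


def fit_bagged_stumps(X, y, n_estimators=11, rng=None):
    """Bagging: fit each stump on a bootstrap resample, then predict by majority vote."""
    if rng is None:
        rng = random.Random(0)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        sample = [rng.randrange(n) for _ in range(n)]  # draw n rows with replacement
        models.append(fit_stump([X[i] for i in sample], [y[i] for i in sample]))

    def predict(row):
        votes = Counter(model(row) for model in models)
        return votes.most_common(1)[0][0]

    return predict


def cross_val_accuracy(fit, X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and pool accuracy over all folds."""
    folds = k_fold_indices(len(X), k, random.Random(seed))
    correct = 0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(model(X[i]) == y[i] for i in held_out)
    return correct / len(X)


if __name__ == "__main__":
    X, y = make_toy_data()
    print("single stump :", cross_val_accuracy(fit_stump, X, y))
    print("bagged stumps:", cross_val_accuracy(fit_bagged_stumps, X, y))
```

Because every instance is held out exactly once, the pooled accuracy is computed only from predictions on data the model did not see during training. The numbers printed for the toy data say nothing about the accuracies reported in the abstract.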
Keywords/Search Tags: Methods, Weight loss, Classification, Predict, Trees, Toxicity, Ensemble