Machine Learning Methods with Hierarchical Data

Posted on:2018-02-25

Degree:M.S

Type:Thesis

University:University of Colorado Denver, Anschutz Medical Campus

Candidate:Roberts, Katherine M

GTID:2478390020457476

Subject:Biostatistics

Abstract/Summary:

General Linear Mixed Modeling (GLMM) has been an established method for classifying and predicting disease outcome in the field of radiology. This paper compares several machine learning methods for analyzing hierarchically structured, unbalanced, dichotomous outcome data. The goal is to determine whether the hierarchical structure of the described data makes a difference when choosing among these methods.

The methods assessed alongside GLMM were two-way Naive Bayes (NB), Penalized Linear Discriminant Analysis (PDA), and Random Forests (RF). While all methods evaluated the dataset naively (i.e., without taking hierarchy into account), this paper also shows an expansion of PDA and RF to include first-level data in the hierarchical structure. Validation methods include 60/40 validation sets (training and testing data partitions) as well as leave-one-out cross-validation (LOOCV).

Data were simulated to investigate the adequacy of these techniques under different correlation (between-subject variance) and sample size parameters. ROC curves with AUC (95% CI), Youden indexes, sensitivities and specificities, and prediction accuracies were evaluated.

We show no prediction accuracy gain over GLMM for our particular dataset. While sensitivities and specificities differ across methods, further evaluation on more robust data, along with additional work to improve and expand the machine learning functions presented here, is desirable.
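
For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how sensitivity, specificity, and the Youden index (J = sensitivity + specificity - 1) can be computed for a binary classifier. It is not taken from the thesis: the function names, the toy labels, and the toy scores are all hypothetical, and the sketch assumes both outcome classes are present in the data.

```python
def sensitivity_specificity(y_true, scores, threshold):
    """Sensitivity and specificity when predicting positive for score >= threshold.

    Assumes y_true contains at least one 1 and at least one 0.
    """
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif not predicted_positive and y == 0:
            tn += 1
        else:
            fn += 1
    return tp / (tp + fn), tn / (tn + fp)


def best_youden(y_true, scores):
    """Scan every observed score as a cut-off and return the one maximizing J."""
    best = None
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(y_true, scores, t)
        j = sens + spec - 1  # Youden index
        if best is None or j > best[0]:
            best = (j, t, sens, spec)
    return best


# Hypothetical, unbalanced toy data (3 positives, 7 negatives); not thesis data.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
scores = [0.10, 0.20, 0.15, 0.30, 0.35, 0.40, 0.45, 0.80, 0.60, 0.90]

j, threshold, sens, spec = best_youden(y_true, scores)
print(f"J={j:.3f} at threshold={threshold}, sens={sens:.3f}, spec={spec:.3f}")
```

With unbalanced outcomes such as these, overall prediction accuracy can look high even when sensitivity is poor, which is one reason to report the Youden index and the sensitivity/specificity pair alongside accuracy.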

Keywords/Search Tags:

Machine learning, Methods, Data, GLMM, Hierarchical

Related items

1	Research On Hybrid Hierarchical Extreme Learning Machine Algorithm
2	Scalable Sparse Machine Learning Methods for Big Data
3	Application of Machine Learning and Statistical Learning Methods for Prediction In A Large-Scale Vegetation Ma
4	Research On New Hierarchical Fuzzy Classification Learning Method
5	Geometric methods in machine learning and data mining
6	Research On Indoor Positioning Algorithm Based On Extreme Learning Machine In Incomplete Data Sets
7	Research On Machine Learning Methods That Exploit Unlabeled Data
8	Research On Classification Methods Based On Extreme Learning Machine
9	Bayesian multilevel analysis of binary time-series cross-sectional data in political economy
10	Research On Time-Evolving Based Machine Learning Methods