Properties of hypothesis tests using generalized additive models with smoothers of geographic location in spatial statistics

Posted on: 2011-11-23
Degree: Ph.D
Type: Thesis
University: Boston University
Candidate: Young, Robin Lynne
Full Text: PDF
GTID: 2460390011471896
Subject: Biology
Abstract/Summary:
An important problem in spatial epidemiology is how to measure and identify variation in disease risk across a geographic study region. The problem has two parts: spatial variation must be detected, and areas of increased or decreased risk must be located. One statistical method is generalized additive models (GAMs), a regression method that allows nonparametric associations between outcomes and predictors. In spatial statistics, GAMs apply a bivariate smoother to account for geographic location as a predictor of disease status. The degree of smoothing is controlled by the span size, the proportion of data assigned non-zero weights. A natural question is whether location is associated with disease, i.e. whether the smoothing term is necessary. An approximate chi-square test (ACST) is available, but its statistic is known to be biased. A reasonable alternative is the conditional permutation test (CPT), in which the span for the observed data is determined by minimizing Akaike's Information Criterion (AIC) across models fitted over a range of spans. The selected span is then held constant for models fitted to the permuted datasets. The size and power of ACST and CPT have yet to be evaluated.

We proposed two alternative permutation testing methods, the fixed span (FSPT) and unconditional (UPT) permutation tests, which differ from CPT in how the span is selected. The span for FSPT is selected a priori, while for UPT, spans for the observed and permuted data are selected independently by minimizing the AIC. We used data simulated under null and simple alternative hypotheses to evaluate size and power. We found that ACST and CPT had inflated type I error rates, while FSPT and UPT error rates were near the nominal value. FSPT had the highest power estimates and UPT had the lowest.

We proposed a multiple span permutation test to evaluate models fitted with 3-5 spans, avoiding the a priori single span selection of FSPT. It was compared to the spatial scan statistic, along with a size-corrected CPT, through application to synthetic data generated under three alternative hypotheses. Across all scenarios, the GAM methods had similar or greater power than the spatial scan statistic, and their sensitivity (the probability of detecting the exposure source location) exceeded the spatial scan statistic estimates.
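
The logic of a fixed span permutation test can be illustrated with a minimal sketch. It is not the thesis's implementation: a nearest-neighbour average stands in for the bivariate smoother of a GAM, the test statistic (the drop in binomial deviance relative to an intercept-only model) is an assumed choice, and all function names and the simulated data are hypothetical.

```python
import math
import random

def neighbour_lists(xy, span):
    """For each point, indices of the nearest span*n points (itself included)."""
    n = len(xy)
    k = max(2, round(span * n))
    out = []
    for xi, yi in xy:
        order = sorted(range(n),
                       key=lambda j: (xy[j][0] - xi) ** 2 + (xy[j][1] - yi) ** 2)
        out.append(order[:k])
    return out

def deviance(y, p):
    """Binomial deviance of 0/1 outcomes y under fitted probabilities p."""
    eps = 1e-6
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1 - eps)  # keep log() finite
        total += yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return -2.0 * total

def deviance_drop(y, nbrs):
    """Deviance of the intercept-only model minus deviance of the smoothed model."""
    mean = sum(y) / len(y)
    smooth = [sum(y[j] for j in idx) / len(idx) for idx in nbrs]
    return deviance(y, [mean] * len(y)) - deviance(y, smooth)

def fixed_span_permutation_test(xy, y, span, n_perm=199, seed=0):
    """Permutation p-value for 'location is unrelated to outcome' at a fixed span."""
    rng = random.Random(seed)
    nbrs = neighbour_lists(xy, span)  # span chosen a priori, so neighbourhoods never change
    observed = deviance_drop(y, nbrs)
    exceed = 0
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break any link between location and outcome
        if deviance_drop(y_perm, nbrs) >= observed:
            exceed += 1
    return observed, (1 + exceed) / (n_perm + 1)

# Hypothetical data: elevated risk inside a circle centred at (0.5, 0.5).
rng = random.Random(1)
xy = [(rng.random(), rng.random()) for _ in range(150)]
y = [1 if rng.random() < (0.9 if math.hypot(a - 0.5, b - 0.5) < 0.3 else 0.1) else 0
     for a, b in xy]
stat, p_value = fixed_span_permutation_test(xy, y, span=0.2)
```

Because the span is fixed before any data are examined, the observed and permuted datasets are smoothed in exactly the same way. A conditional variant would instead choose the span from the observed data and reuse it for the permutations, and an unconditional variant would re-select the span for every permuted dataset.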
Keywords/Search Tags:Spatial, Location, Statistic, Geographic, Models, CPT, FSPT, Test