
Spatial-temporal Bayesian analysis for over-dispersed count data

Posted on: 2005-08-08
Degree: Ph.D
Type: Dissertation
University: State University of New York at Albany
Candidate: Chen, Haiyan
Full Text: PDF
GTID: 1458390008480893
Subject: Biology
Abstract/Summary:
Over-dispersion is common in count data on human disease and calls for probability models more flexible than the Poisson. This dissertation proposes two new test statistics, the Simple Linear Test (TS) and the Log-Linear Test (TL), to identify when the Poisson model fails. Simulation studies show that TL, but not TS, has the correct rejection rate and essentially the same power as the classical Fisher-Böhning statistic (TF) for standard alternatives to the Poisson. Although the TL test exploits only the hypothesized equality between mean and variance, the approach can be extended to test other relationships between mean and variance.

Non-trivial spatial and temporal patterns in disease occurrence and rates are often studied through maps, yet less is known about how to analyze such maps statistically. This dissertation investigates three approaches used for this purpose: crude rate estimates, Empirical Bayes Standardization (EBS), and the full Bayesian hierarchical regression model. Although the crude rate is a simple first step in estimating the incidence rate, maps of crude rates can be misleading when population sizes vary widely across regions. The EBS approach was proposed to correct this weakness by borrowing strength from the ensemble, but it can over-shrink the crude rates toward their grand mean. The full Bayesian approach outperforms EBS and offers greater flexibility in modeling variability that arises from various mechanisms (e.g., interactions at various levels of the hierarchy within the regional context).

These techniques are then applied to a data set, referred to as the NYSLD data, built from the 1990-2000 New York State (NYS) Lyme Disease (LD) data, the 1990-2000 estimated populations for NYS counties from the United States Census Bureau, and spatial data from the Geographic Information System of the Environmental Systems Research Institute. The new TL test shows that a Poisson model fails to fit the NYSLD data, while the Negative Binomial appears to be a natural alternative. The full Bayesian Negative Binomial approach confirms and refines the findings of the other two approaches: LD incidence is increasing in NYS and is spreading in a diffusion-wave-like pattern from the southeast toward the north.
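
To make the idea of testing mean-variance equality concrete, the following is a minimal Python sketch of a classical variance-to-mean (index of dispersion) check in its normalized form. It is not the dissertation's TS or TL statistic, whose definitions are not given in this abstract, and it may differ in detail from the TF statistic used there. The function name `dispersion_test` and the sample counts are hypothetical, chosen only for illustration.

```python
import math


def dispersion_test(counts):
    """Return (variance-to-mean ratio, z statistic, one-sided p-value).

    Illustrative only: a normalized index-of-dispersion check of the
    Poisson assumption that the mean equals the variance.
    """
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two counts")
    mean = sum(counts) / n
    if mean <= 0:
        raise ValueError("mean count must be positive")
    # Unbiased sample variance.
    var = sum((x - mean) ** 2 for x in counts) / (n - 1)
    ratio = var / mean
    # Under the Poisson hypothesis, (n - 1) * ratio is approximately
    # chi-square with n - 1 degrees of freedom, so ratio has mean 1 and
    # variance about 2 / (n - 1). Standardize accordingly.
    z = math.sqrt((n - 1) / 2.0) * (ratio - 1.0)
    # Upper-tail normal probability: a large z suggests over-dispersion.
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return ratio, z, p_value


# Hypothetical counts: many zeros with a few large values.
clustered = [0, 0, 1, 0, 12, 0, 3, 0, 25, 1]
ratio, z, p = dispersion_test(clustered)
print(f"ratio={ratio:.2f}, z={z:.2f}, p={p:.4f}")
```

A ratio well above 1 with a small p-value points toward over-dispersion, in which case a model such as the Negative Binomial may be a better candidate than the Poisson. The normal approximation is rough for small samples, so this sketch should be read as a screening step rather than a definitive test.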
Keywords/Search Tags:Data, NYS, Bayesian, Approach