
Cumulative logit - Poisson and cumulative logit - negative binomial compound regression models for count data

Posted on: 2009-12-30
Degree: Ph.D
Type: Dissertation
University: The George Washington University
Candidate: VanRaden, Mark Joseph
Full Text: PDF
GTID: 1440390002495889
Subject: Biology
Abstract/Summary:
Count data arise in many settings and often can be analyzed by Poisson regression, with covariates predicting each individual's true mean rate. The mean rate times the length of exposure completely specifies the Poisson probability of every possible count. To handle various complications that arise in practice, many adaptations to this basic structure have been devised. For example, the observed probability of a zero count may diverge from the Poisson. Often, more zeroes occur than the Poisson predicts, but sometimes fewer occur. Models that handle these deviations are well developed and are frequently applied. These include the zero-inflated Poisson (ZIP), zero-altered Poisson (ZAP), Poisson hurdle (PH) and others. In other cases, the Poisson underpredicts the true variance, but negative binomial (NB) regression, which generalizes the Poisson, resolves this. Adaptations for zero counts work similarly, yielding the analogous acronyms ZINB, NBH, etc.

In yet other cases, data may behave like Poisson or NB but significant deviations extend slightly beyond zero. The literature apparently offers only two such adaptations of Poisson regression (namely, alterations for two counts; Silva and Covas, 2000; Melkersson and Rooth, 2000). Neither provides parsimonious generalizations for alterations at multiple counts.

This dissertation combines a proportional odds ordinal regression for the lowest counts with a conditionally Poisson or NB structure for all higher counts. This offers flexibility while permitting parsimony and interpretability. The model parameters are estimated by maximum likelihood, given a specified point of transition from proportional odds to Poisson or NB. The transition point may be determined by prior knowledge, by analysis goals or by exploratory fitting. Exposure time is incorporated in both the lower and upper parts. Parameter estimates are shown to be asymptotically normal, justifying the use of likelihood-based chi-square statistics. The expectation and variance are computed, yielding measures of influence and of general over- or underdispersion. A Pearson-type goodness-of-fit test is available.

The model is fit to a well-studied dataset of episodes of severe hypoglycemia (Lachin, 2000, among others), illustrating its potential. Advantages, limitations and areas for potential future research are described.
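
The abstract does not give the model's exact parameterization, so the following Python sketch is only one plausible reading of "proportional odds for the lowest counts, conditionally Poisson above". It assumes cumulative-logit cutpoints for counts below a transition point c, and a Poisson truncated to counts of c or more that receives the remaining probability mass. The names `compound_pmf`, `cutpoints`, `eta` and `mu` are hypothetical, and exposure enters here only through the Poisson mean; how the dissertation incorporates exposure in the lower part is not stated in the abstract, so that is left out.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def poisson_pmf(y, mu):
    """Poisson probability of count y with mean mu > 0, computed on the log scale."""
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))


def compound_pmf(y, cutpoints, eta, mu):
    """P(Y = y) under an illustrative cumulative logit - Poisson compound model.

    cutpoints : increasing values a_0 < ... < a_{c-1} (assumed parameterization);
                the transition point is c = len(cutpoints)
    eta       : linear predictor x'beta shared by all cumulative logits
                (the proportional odds assumption)
    mu        : Poisson mean, i.e. rate times exposure, for the upper part
    """
    c = len(cutpoints)
    # Lower part: P(Y <= j) = expit(a_j - eta) for j = 0, ..., c-1.
    cum = [expit(a - eta) for a in cutpoints]
    if y < c:
        below = cum[y - 1] if y > 0 else 0.0
        return cum[y] - below
    # Upper part: the mass P(Y >= c) is spread over y >= c in proportion
    # to a Poisson truncated to counts of c or more.
    upper_mass = 1.0 - cum[-1]
    poisson_tail = 1.0 - sum(poisson_pmf(k, mu) for k in range(c))
    return upper_mass * poisson_pmf(y, mu) / poisson_tail


# Hypothetical values: transition point c = 2, rate 2.5 per unit time, exposure 1.
cutpoints = [-0.5, 0.8]
eta = 0.3
mu = 2.5 * 1.0
probs = [compound_pmf(y, cutpoints, eta, mu) for y in range(200)]
print(sum(probs))  # should be very close to 1
```

Under these assumptions the probabilities of counts 0 and 1 are free of the Poisson shape, while counts of 2 or more keep the Poisson's relative frequencies. A maximum likelihood fit would sum the logarithm of `compound_pmf` over observations for a fixed transition point.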
Keywords/Search Tags:Poisson, Regression, Count