Rank-based tempo-spatial clustering: A framework for rapid outbreak detection using single or multiple data streams

Posted on: 2013-07-18
Degree: Ph.D
Type: Dissertation
University: University of Pittsburgh
Candidate: Que, Jialan
Full Text: PDF
GTID: 1458390008976165
Subject: Information Technology
Abstract/Summary:
In recent decades, algorithms for disease outbreak detection have become a main interest of public health practitioners as a way to identify and localize an outbreak as early as possible, so that the public health response can prevent a pandemic from developing. Today's increased threat of biological warfare and terrorism provides an even stronger impetus to develop methods for outbreak detection based on symptoms as well as on definitive laboratory diagnoses. In this dissertation, I explore the problems inherent in rapid disease outbreak detection using both spatial and temporal information. I develop a framework of nonparameterized algorithms that search for patterns of disease outbreak in spatial subregions of the monitored region within a certain period. Compared with existing spatial or tempo-spatial algorithms, the algorithms in this framework provide a methodology for fast searching of either a univariate or a multivariate data set. The framework first measures how likely it is that an outbreak is occurring in each study area, given the baseline data and the currently observed data. It then applies a greedy search that uses the risk measurement for each unit area as a heuristic to look for clusters with high posterior probabilities. The performance of the proposed algorithms is then evaluated. From the perspective of predictive modeling, I adopt a Gamma-Poisson (GP) model to compute the probability of an outbreak in each cluster when analyzing univariate data. I build a multinomial generalized Dirichlet (MGD) model to identify outbreak clusters from multivariate data that include the over-the-counter (OTC) data streams collected by the National Retail Data Monitor (NRDM) [1] and the emergency department (ED) data streams collected by the RODS system [2].
Key contributions of this dissertation include 1) the introduction of a rank-based tempo-spatial clustering algorithm, RSC, which uses greedy searching and a Bayesian GP model for disease outbreak detection and achieves comparable detection timeliness and cluster positive predictive value (PPV) with improved running time; and 2) the proposal of a multivariate extension of RSC (MRSC), which applies an MGD model. The evaluation demonstrates the advantage of the MGD model in effectively suppressing false alarms caused by baseline shifts.
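The two-step idea described in the abstract (score each unit area, then search greedily in rank order) can be illustrated with a minimal Python sketch. This is not the dissertation's RSC implementation. It assumes a textbook Gamma-Poisson setup in which the observed count is Poisson with rate q times the baseline and q has a Gamma prior, so the marginal likelihood is negative binomial. The prior outbreak probability, the Gamma parameters, the function names, and the toy counts are all hypothetical, and the sketch ignores spatial adjacency and the time dimension entirely.

```python
import math


def log_marginal(count, baseline, alpha, beta):
    """Log marginal likelihood of count ~ Poisson(q * baseline),
    with q ~ Gamma(shape=alpha, rate=beta) integrated out."""
    return (math.lgamma(alpha + count) - math.lgamma(alpha)
            - math.lgamma(count + 1)
            + alpha * math.log(beta / (beta + baseline))
            + count * math.log(baseline / (beta + baseline)))


def outbreak_posterior(count, baseline, prior=0.05,
                       null=(20.0, 20.0), alt=(30.0, 20.0)):
    """Posterior probability of an outbreak for one area or cluster.
    Assumed priors: relative risk has mean 1.0 under the null
    hypothesis and mean 1.5 under the outbreak hypothesis."""
    log_h0 = math.log(1.0 - prior) + log_marginal(count, baseline, *null)
    log_h1 = math.log(prior) + log_marginal(count, baseline, *alt)
    top = max(log_h0, log_h1)  # subtract the max for numerical stability
    w0, w1 = math.exp(log_h0 - top), math.exp(log_h1 - top)
    return w1 / (w0 + w1)


def greedy_cluster(counts, baselines):
    """Rank areas by their individual posterior, then add them to the
    cluster in rank order and keep the best-scoring prefix."""
    ranked = sorted(
        counts,
        key=lambda a: outbreak_posterior(counts[a], baselines[a]),
        reverse=True,
    )
    best, best_p = [], 0.0
    count_sum, baseline_sum = 0, 0.0
    for i, area in enumerate(ranked):
        count_sum += counts[area]
        baseline_sum += baselines[area]
        p = outbreak_posterior(count_sum, baseline_sum)
        if p > best_p:
            best, best_p = ranked[:i + 1], p
    return best, best_p


# Toy data (hypothetical): areas A and B are elevated, C and D are not.
counts = {"A": 30, "B": 28, "C": 10, "D": 9}
baselines = {"A": 10.0, "B": 10.0, "C": 10.0, "D": 10.0}
cluster, posterior = greedy_cluster(counts, baselines)
print(cluster, round(posterior, 3))
```

Scanning only prefixes of the ranked list keeps the search linear in the number of areas after sorting, which is the kind of speed-up a rank-based heuristic aims for. A real tempo-spatial search would also have to respect geographic contiguity and the time window.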
Keywords/Search Tags:Outbreak detection, Data, MGD, Model, Tempo-spatial, Framework, Algorithms