
Extreme value theory-based P values in time series outlier detection

Posted on: 2006-10-10
Degree: Ph.D.
Type: Thesis
University: The University of Wisconsin - Madison
Candidate: Munoz del Rio, Alejandro
Full Text: PDF
GTID: 2458390008467255
Subject: Statistics
Abstract/Summary:
This thesis examines how extreme value theory can aid in obtaining approximate P values for a class of outlier detection algorithms in time series analysis (Chang et al. 1988). Additive and innovational presses are added to the set of perturbations commonly considered. Chapter 1 makes the case for both changes. Chapter 2 presents outlier detection in an intervention analysis framework, which leads to outlier effect estimates based on linear regression. Since the estimates form a dependent Gaussian process, the asymptotic distribution of their maximum can be obtained using extreme value theory. Berman's condition (Leadbetter et al. 1983) is shown to hold for innovational perturbations, which were chosen for their tractability; this is used to establish that the outlier detection test statistic has the same asymptotic Gumbel distribution it would have if the test statistics were independent. Chapter 3 examines the quality of the asymptotic distribution for finite samples in the presence of varying degrees of dependence. Five methods to correct for dependence are presented: Hsing's trivariate approximation (McCormick and Reeves 1988), improved Bonferroni bounds (Worsley 1982), estimates of the error in the Gumbel approximation (Rootzen 1983), a probability generating function approach (McCormick and Reeves 1988), and a cluster index (Hsing et al. 1996). The approximations are compared in an example and in a Monte Carlo study. In Chapter 4, the Chen and Liu (1993) outlier detection algorithm is modified to detect presses and to base detection on P values computed with the McCormick and Reeves correction. The performance of the modified algorithm is assessed via simulation and by analyzing three actual time series. Outlier detection based on P values that reflect the dependence structure of the perturbations and the particular ARMA structure being modelled is found to compare favorably with existing methods, which rely on fixed thresholds. However, the computation of the approximate P values runs into occasional numerical problems, particularly for large (n = 250) sample sizes. Chapter 5 lists conclusions and directions for future work.
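As a rough illustration of the idea behind the Gumbel-based P values, the following Python sketch compares three ways of attaching a P value to the largest of n test statistics. It is not the thesis's algorithm: it assumes, for simplicity, a one-sided maximum of n independent standard normal statistics, uses the classical norming constants for the normal maximum, and applies none of the dependence corrections listed above. All function names are hypothetical.

```python
import math


def normal_sf(x):
    """Upper tail probability P(Z > x) for a standard normal Z."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def gumbel_norming(n):
    """Classical norming constants (a_n, b_n) for the maximum of n
    iid N(0, 1) variables, so that a_n * (M_n - b_n) is approximately
    standard Gumbel for large n. Requires n >= 3 so log(log(n)) exists."""
    root = math.sqrt(2.0 * math.log(n))
    a_n = root
    b_n = root - (math.log(math.log(n)) + math.log(4.0 * math.pi)) / (2.0 * root)
    return a_n, b_n


def p_value_gumbel(c, n):
    """Asymptotic P value: P(M_n > c) under the Gumbel approximation."""
    a_n, b_n = gumbel_norming(n)
    return 1.0 - math.exp(-math.exp(-a_n * (c - b_n)))


def p_value_exact_iid(c, n):
    """Exact P(M_n > c) when the n statistics are independent N(0, 1)."""
    return 1.0 - (1.0 - normal_sf(c)) ** n


def p_value_bonferroni(c, n):
    """First-order Bonferroni upper bound on P(M_n > c)."""
    return min(1.0, n * normal_sf(c))


if __name__ == "__main__":
    # Illustrative values only: n statistics, observed maximum c.
    n, c = 100, 3.5
    print("Gumbel approximation:", p_value_gumbel(c, n))
    print("Exact under independence:", p_value_exact_iid(c, n))
    print("Bonferroni bound:", p_value_bonferroni(c, n))
```

Comparing the first two numbers for moderate n gives a feel for the finite-sample quality of the Gumbel approximation, which is the question Chapter 3 takes up for dependent statistics. The Bonferroni value is always an upper bound on the exact independent-case probability.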
Keywords/Search Tags:Outlier detection, Extreme value, Values, Time series, Chapter