
New Models and Methods for Applied Statistics: Topics in Computer Experiments and Time Series Analysis

Posted on: 2018-09-09
Degree: Ph.D.
Type: Dissertation
University: Rutgers, The State University of New Jersey - New Brunswick
Candidate: Zhao, Yibo
GTID: 1478390020457499
Subject: Biostatistics
Abstract/Summary:
In applied statistics, models are developed to solve real-world problems based on data. However, data are growing fast and becoming increasingly massive and complex, and conventional models are limited in their ability to handle such data. This dissertation develops two new models, one in computer experiments and one in time series analysis. The new models are built around the special features of two real-world problems, with datasets from an IBM data center thermal study and a biological cell adhesion experiment.

For computer experiments, we address two important issues in Gaussian process (GP) modeling. One is how to reduce the computational complexity of GP modeling; the other is how to simultaneously perform variable selection and estimation for the mean function of GP models. Estimation is computationally intensive for GP models because it relies heavily on manipulations of an n-by-n correlation matrix, where n is the sample size. Conventional penalized likelihood approaches are widely used for variable selection. However, the computational cost of penalized maximum likelihood estimation (PMLE) or the corresponding one-step sparse estimation (OSE) can be prohibitively high as the sample size becomes large, especially for GP models. To address both issues, this work proposes an efficient subsample aggregating (subagging) approach with an experimental design-based subsampling scheme. The proposed method is computationally cheaper, yet it can be shown that the resulting subagging estimators asymptotically achieve the same efficiency as the original PMLE and OSE. The finite-sample performance is examined through simulation studies. Application of the proposed methodology to a data center thermal study reveals some interesting information, including the identification of an efficient cooling mechanism.

Motivated by an analysis of cell adhesion experiments, we introduce a new statistical framework that incorporates the unique features of these experiments and allows the molecular binding mechanism to be studied. This framework is based on an extension of Markov switching autoregressive (MSAR) models, a regime-switching type of time series model generalized from hidden Markov models. Standard MSAR models were developed for the analysis of an individual stochastic process. To handle multiple time series, we introduce the Markov switching autoregressive mixed (MSARM) model, which simultaneously models multiple time series collected from different experimental subjects, as in the longitudinal data setting. More than a simple extension, the MSARM model poses statistical challenges in theoretical development as well as in computational efficiency for high-dimensional integration.
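
The subsample aggregating (subagging) idea mentioned above for the GP mean function can be illustrated with a minimal sketch. This is not the dissertation's method: it assumes simple random subsampling in place of the experimental design-based scheme, and ordinary least squares in place of penalized GP likelihood estimation. The function name `subagging_estimate` and its parameters are hypothetical, chosen only for illustration. The sketch shows the general pattern of fitting on several small subsamples and averaging the estimates, so that each fit works with m observations rather than all n.

```python
import numpy as np


def subagging_estimate(X, y, n_subsamples=20, subsample_size=50, seed=0):
    """Average least-squares coefficient estimates over random subsamples.

    Illustrative assumptions: simple random subsampling (not design-based)
    and ordinary least squares (not a penalized GP likelihood).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    estimates = []
    for _ in range(n_subsamples):
        # Draw a subsample of row indices without replacement.
        idx = rng.choice(n, size=subsample_size, replace=False)
        # Fit the mean-function coefficients on the subsample only.
        beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        estimates.append(beta)
    # Aggregate by averaging the subsample estimates.
    return np.mean(estimates, axis=0)


# Toy data: a linear mean function with one inactive variable.
rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 3))
true_beta = np.array([1.5, 0.0, -2.0])
y = X @ true_beta + 0.1 * rng.normal(size=2000)

beta_hat = subagging_estimate(X, y)
print(beta_hat)
```

In a GP setting the saving would come from working with an m-by-m correlation matrix per subsample instead of the full n-by-n matrix; the sketch above omits the correlation structure entirely and only demonstrates the subsample-then-average step.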
Keywords/Search Tags:Models, Time series, Data, Computer, Experiments