Modelling Research For Incomplete Time Series And Longitudinal Data

Posted on: 2024-05-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W Xiong
GTID: 1520307064474134
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Time series data and longitudinal data arise in many research fields. In practice, however, such data are often collected incompletely for a variety of reasons. Existing analyses commonly assume a simple response mechanism or a known model structure, and they can lose efficiency when the mechanism is complex or the model is misspecified. For instance, applying a modelling method designed for data missing at random (MAR) to data missing not at random (MNAR) requires stronger restrictions, since the MNAR response probability depends on the values that are missing. Moreover, because the distributions of the response mechanism and of the observations are difficult to specify correctly under complex practical conditions, most estimators are sensitive to model misspecification. There is therefore an urgent need to develop several candidate models or to use nonparametric structures to improve modelling accuracy, and to contribute more robust methods that ensure the consistency of the estimator.

In this thesis, several modelling problems are studied for autoregressive processes with missing data and for longitudinal quantile regression with a right-censored history process. Firstly, when the observations of an autoregressive model are MNAR, a method combining first-step imputation and semiparametric estimation is proposed. Secondly, this thesis proposes multiply robust estimation to handle sparse coefficients in high-order autoregressive processes with MAR explanatory variables. Finally, this thesis proposes an exponential aggregation algorithm for longitudinal quantile regression and applies it to estimation and prediction under a right-censored history process. The main methodologies and results of the thesis are as follows:

1. Semi-imputation estimation for integer autoregressive model with data missing not at random

In this thesis we study estimation for the first-order Poisson integer
autoregressive (PINAR(1)) model with MNAR observations. Under a correctly specified parametric response mechanism, this thesis combines the observed likelihood with first-step imputation to propose a semiparametric estimator. Since a set of completely observed “covariates” is hardly available in PINAR(1) and the missing time points are not fixed, this thesis first constructs a sequence of “fully observed” data via imputation to overcome the computational difficulty of the integral in the likelihood function. In addition, nonparametric regression is used to estimate the conditional expectations in the score functions, because the distribution of X_t | (X_{t-1}, δ_t = 0) is not determined in the MNAR case. Theoretically we prove the consistency and asymptotic normality of the proposed estimators, and compare the efficiency of different first-step imputation methods by numerical simulations. The simulations show that, although its performance partly depends on the choice of first-step imputation, the proposed methodology generally performs better than imputation estimation. Further, simulations of parameter estimation for INAR(1) models with other innovation distributions and MNAR data are carried out to verify the versatility of the methodology, and we apply it to analyze Pittsburgh monthly crime data.

2. Penalized multiply robust estimation in high-order autoregressive model with missing explanatory variables

This thesis studies parameter estimation in high-order autoregression with dependent MAR 0-1 explanatory variables. Since the conditional distribution of the explanatory variables and the response mechanism may be misspecified in practice, and including many lags with sparse coefficients degrades prediction accuracy in high-order cases, we define the “true structure” model to generalize correct specification, and further propose the penalized multiply robust estimation equation (PMREE) for the autoregressive coefficients, which combines variable selection with multiply robust estimation. Theoretically one can verify that when the “true
structure” model is contained in the above candidates, PMREE can both screen out insignificant terms and guarantee the consistency and asymptotic normality of the estimators of the significant coefficients, i.e., the estimator is multiply robust and has oracle properties. In addition, the selection criterion for the tuning parameter is modified within the robust framework, and we verify its convergence. For numerical computation, this thesis adopts local quadratic approximation for the iteration, and estimates the tuning parameters and the shape parameters of the penalty functions within a data-driven optimization algorithm. The performance of the proposed method is studied via simulations, and we compare it with general multiply robust estimation to demonstrate its rationality. Finally, the methodology is used to fit U.S. industrial production index data, and its applicability is shown through estimation and prediction with the fitted model.

3. Adaptive aggregation for longitudinal quantile regression

This thesis proposes an aggregation estimator for the longitudinal quantile regression model, which uses an exponential aggregation weighting (EAW) algorithm to weight all candidate procedures for the conditional quantile. To overcome the theoretical difficulty caused by the quantile loss function, we propose a secondary smoothing strategy for the loss function, and demonstrate its strong convexity and L-smoothness on its domain. Furthermore, this thesis considers estimation and prediction for the additive quantile mixed-effects model with a right-censored history process. When the fixed-effect part has multiple candidate procedures, this thesis combines the proposed algorithm with inverse probability of censoring weighting estimation to obtain the estimator of the cumulative quantile function (τ-CQF), and modifies the best linear prediction in Geraci (2014) via EAW. The risk bounds of the EAW estimator under the original quantile loss and the squared error are derived via oracle inequalities, and we prove that when the set of candidates contains a
consistent procedure for the fixed-effect model, the τ-CQF estimator is consistent. This thesis designs a number of simulation studies to check the performance of the EAW algorithm in fitting, estimation, and prediction, and we analyze the medical cost data from the Multicenter Automatic Defibrillator Implantation Trial (MADIT) to verify the practicability of the proposed methodology.

The methodologies proposed in the thesis enrich the research on robust estimation for incomplete data, and are helpful for modelling and analysis in various fields such as economics, medicine, and biostatistics, thereby improving the accuracy of model fitting and prediction.
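
As an illustration of the data setting in part 1, the following minimal sketch simulates a PINAR(1) series and then deletes observations under an MNAR mechanism. It assumes the standard binomial-thinning definition of PINAR(1) and a logistic response model; the function names `simulate_pinar1` and `mnar_indicator` and the parameters `gamma0` and `gamma1` are hypothetical, since the abstract does not state the form of the response mechanism. It is not the thesis's estimation procedure, only a way to generate data of the kind it studies.

```python
import numpy as np


def simulate_pinar1(n, alpha, lam, rng):
    """Simulate X_t = alpha o X_{t-1} + eps_t with Poisson(lam) innovations.

    'o' is binomial thinning: alpha o X ~ Binomial(X, alpha).
    """
    x = np.empty(n, dtype=np.int64)
    # The stationary marginal of this process is Poisson(lam / (1 - alpha)).
    x[0] = rng.poisson(lam / (1.0 - alpha))
    for t in range(1, n):
        survivors = rng.binomial(x[t - 1], alpha)  # thinning of the previous count
        x[t] = survivors + rng.poisson(lam)        # add the new innovation
    return x


def mnar_indicator(x, gamma0, gamma1, rng):
    """Return delta_t (1 = observed, 0 = missing) under a logistic MNAR model.

    Assumed form: P(delta_t = 1 | X_t) = expit(gamma0 + gamma1 * X_t),
    so the chance of observing X_t depends on X_t itself.
    """
    p_obs = 1.0 / (1.0 + np.exp(-(gamma0 + gamma1 * x)))
    return (rng.random(x.shape[0]) < p_obs).astype(np.int64)


rng = np.random.default_rng(0)
x = simulate_pinar1(2000, alpha=0.4, lam=2.0, rng=rng)
delta = mnar_indicator(x, gamma0=2.0, gamma1=-0.3, rng=rng)
print("full mean:", x.mean(), "observed-only mean:", x[delta == 1].mean())
```

With a negative `gamma1`, large counts are more likely to be missing, so the observed-only mean tends to understate the full mean. This is the kind of bias that motivates modelling the response mechanism explicitly.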
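
The weighting idea in part 3 can be sketched with the generic exponential weighting scheme, in which each candidate receives a weight proportional to the exponential of its negative cumulative quantile loss. This is an assumption-laden illustration: the thesis's EAW algorithm, its smoothed loss, and its handling of censoring are not specified in the abstract, and the names `check_loss`, `exp_weights`, and the temperature `eta` are hypothetical.

```python
import numpy as np


def check_loss(u, tau):
    """Quantile (check) loss rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))


def exp_weights(y, candidate_preds, tau, eta):
    """Exponential weights over M candidate quantile predictions.

    candidate_preds has shape (M, n); row j holds candidate j's predictions
    of the tau-th conditional quantile for the n responses in y.
    """
    losses = check_loss(y[None, :] - candidate_preds, tau).sum(axis=1)
    # Subtract the smallest loss before exponentiating for numerical stability;
    # the shift cancels after normalisation.
    w = np.exp(-eta * (losses - losses.min()))
    return w / w.sum()


def aggregate(candidate_preds, weights):
    """Weighted combination of the candidate predictions."""
    return weights @ candidate_preds


rng = np.random.default_rng(1)
y = rng.normal(size=200)
# Three constant candidates for the median; only the first is close to the truth.
preds = np.vstack([np.zeros(200), np.full(200, 5.0), np.full(200, -5.0)])
w = exp_weights(y, preds, tau=0.5, eta=0.1)
print("weights:", w)
print("first aggregated predictions:", aggregate(preds, w)[:3])
```

In this sketch a larger `eta` concentrates the weight on the best-performing candidate, while `eta` near zero approaches uniform averaging.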
Keywords/Search Tags: Incomplete data, time series, longitudinal quantile regression, robust estimation, variable selection, model aggregation