Modeling And Statistical Inference For A Class Of Integer-valued Time Series And Longitudinal Data

Posted on: 2021-01-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X F Liu
GTID: 1360330623977305
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Integer-valued time series data driven by dependent variables are widely available in practice, and the study of such data is of interest in many natural and social sciences. Longitudinal data, especially medical cost data, are also of interest to researchers; a typical problem is the estimation of the medical costs accumulated from the diagnosis of a disease to the terminal event. Such repeated measurements involve time-to-event variables subject to interval censoring and right censoring, together with additional information on covariates and errors. However, most of the existing literature focuses on estimating the cumulative mean function (CMF) of the history process. This thesis therefore studies the modeling and statistical inference of dependence-driven integer-valued time series processes, and combines the inverse probability of censoring weighting (IPCW) technique with a longitudinal quantile regression model to develop a novel procedure for estimating the cumulative quantile function (CQF) of a history process with time-dependent covariates and a right-censored time-to-event variable.

First, we introduce the binomial thinning operator, a popular tool for modelling integer-valued time series data. Let X be a non-negative integer-valued random variable and let α ∈ [0,1). The binomial thinning operator, denoted by "∘", is defined by

α ∘ X = Σ_{i=1}^{X} B_i,

where {B_i} is a sequence of i.i.d. Bernoulli random variables with success probability P(B_i = 1) = 1 − P(B_i = 0) = α. Next, for a fixed time point t, let the conditional distribution function of V(t) given X(t) be F(v(t)|X(t)) = P(V(t) ≤ v(t) | X(t)), and for 0 < τ < 1 let the conditional τth quantile of V(t) given X(t) be Q_τ(V(t)|X(t)) = inf{k: F(k|X(t)) ≥ τ}. The main results of the thesis are as follows.

1. Modelling and statistical inference of the PoDDRCINAR(p) process

For integer-valued time series data with dependent events that may survive or disappear after a period of observation, we propose, based on the binomial thinning operator and a Poisson innovation sequence, a pth-order dependence-driven random coefficient integer-valued autoregressive process, denoted by PoDDRCINAR(p). It is defined as follows.

Definition 1. A non-negative integer-valued process {X_t} given by

(?) (1)

where {ε_t} is an i.i.d. non-negative integer-valued sequence with Poisson distribution Po(λ), is said to be a pth-order Poisson dependence-driven random coefficient integer-valued autoregressive (PoDDRCINAR(p)) process if the following conditions are satisfied:
(i) the joint distribution of {α_{t1}, α_{t2}, ..., α_{tp}} is given by
(?) (2)
where φ_0, φ_1, ..., φ_p are non-negative and Σ_{i=0}^{p} φ_i = 1;
(ii) the time-varying coefficients {α_{ti}, 1 ≤ i ≤ p} are i.i.d. random variables across the time points t;
(iii) {ε_t} is an i.i.d. non-negative integer-valued sequence with probability mass function f_ε > 0 such that E(ε_t^4) < ∞;
(iv) {ε_t} is independent of {α_{ti}, 1 ≤ i ≤ p} and of the random variable sequence {B_{i,t}} in the thinning operator;
(v) the Bernoulli random variables {B_{i,t}} in the thinning term α_{ti} ∘ X_t are independent of X_{t−1}, X_{t−2}, ....

The model defined by Equations (1) and (2) can also be represented in the following form:

(?) (3)

where the following conditions are satisfied:
(i) "w.p." means "with probability", and α_i ∈ (0,1);
(ii) the counting series in α_i ∘ X_{t−i}, i = 1, 2, ..., p, are called survival processes and are mutually independent for all t ∈ Z given X_{t−i}.

The estimate θ̂ is obtained from the following equation:

(?) (4)

The strong consistency and limiting distribution of the estimators θ̂_CLS and θ̂_CML of the PoDDRCINAR(p) model are given by the following two theorems.

Theorem 1. Let {X_t} be a PoDDRCINAR(p) process generated as in Equations (1) and (2), or (3). Then the estimator θ̂_CLS given in Equation (4) is strongly consistent and jointly asymptotically normally distributed.

Theorem 2. Suppose that {X_t} is the strictly stationary and ergodic solution of the model defined by Equations (1) and (2), or (3), and that Assumptions 1 and 2 hold. Then, as n → ∞,
(i) there exists a unique estimator θ̂_CML such that θ̂_CML → θ_0 in probability;
(ii) (?), where (?).

Assumption 1: (?). (?). (?). The parameter space Θ is compact, with Θ defined by (?), where the bounds appearing in the definition are finite positive constants, and θ_0 is an interior point of Θ.

Assumption 2: If there exists a t ≥ 1 such that X_t(θ_0) = X_t(θ), P_{θ_0}-a.s., then θ = θ_0, where P_{θ_0} is the probability measure under the true parameter θ_0.

The finite-sample properties of the conditional maximum likelihood (CML) estimator are examined in relation to the widely used conditional least squares (CLS) estimator. The CML method performs better in terms of bias and MSE. Finally, three real crime data sets are analyzed, and the process is compared with the combination PoINAR(p) process. The results show that the PoDDRCINAR(p) process performs better.

2. Modelling and statistical inference of the DDRCINAR(p) process

We extend the PoDDRCINAR(p) process to a more general process in which the distribution of the innovation sequence is unknown. That is, when the distribution of {ε_t} is unknown in the process defined by Equations (1) and (2), or (3), {X_t} is called a dependence-driven random coefficient integer-valued autoregressive (DDRCINAR(p)) process. The following theorem gives the stationarity and ergodicity of the DDRCINAR(p) process.

Theorem 3. If Σ_{i=1}^{p} φ_i α_i < 1 and the maximum absolute eigenvalue of E[A_t^T (?) A_t] is less than 1, then there exists a unique stationary integer-valued random series {X_t} satisfying Equations (1) and (2), or (3). Furthermore, the process is ergodic.

Next, we consider three estimation methods for the parameters of the DDRCINAR(p) process: the conditional least squares (CLS) method, the weighted conditional least squares (WCLS) method, and the maximum quasi-likelihood (MQE) method. The advantage of these three methods is that they do not require the exact distribution family of the innovation sequence to be specified. The consistency and asymptotic normality of the conditional least squares estimator and the maximum quasi-likelihood estimator are given by the following two theorems.

Theorem 4. Let {X_t} be a DDRCINAR(p) process generated as in Equations (1) and (2), or (3), under the conditions of Theorem 3. Then the estimates θ̂ obtained from Equation (4) are strongly consistent and jointly asymptotically normally distributed.

Theorem 5. The joint limiting distribution of the MQE estimators (?) is (?), where (?) and (?).

The performance of these estimators is investigated and compared via simulation. The simulations show that the maximum quasi-likelihood estimators (MQE) perform better than the conditional least squares (CLS) and weighted conditional least squares (WCLS) estimators in terms of MSE and of the proportion of estimates falling within the parameter space, in certain regions of the parameter space. Finally, the model is applied to two real data sets: epileptic seizure counts and precinct rape counts. The proposed process is compared with the fixed-coefficient process, and the DDRCINAR(p) process is found to perform better.

3. Estimating cumulative medical costs based on quantile regression

The inverse probability of censoring weighting (IPCW) technique is combined with a longitudinal quantile regression model to develop a novel procedure for estimating the cumulative quantile function (CQF) of a history process with time-dependent covariates and a right-censored time-to-event variable. We first give the following definitions:
(i) K_1(t) = P(T > t) is the survival function of T;
(ii) K_2(t) = P(C > t) is the survival function of C;
(iii) K(t) = P(T* > t) is the survival function of T* = min{T, C};
(iv) H_τ(t) = Q_τ(V(t)|X(t)) is the τth-quantile state function of V(t);
(v) ∫_0^s H_τ(t) dt is the CQF of V(t) over the period [0, s].

We next develop a natural estimator of the τth-quantile state function H_τ(t):

(?) (5)

Then, for any time point s ∈ [0, L], the proposed estimator of the cumulative τth-quantile state function (CQF) of the history process V(t) over the interval [0, s] is defined as

(?) (6)

The following regularity conditions are needed to derive the asymptotic properties of the proposed CQF estimator:
(C1) Conditional on B_X(t), B_V(t) and T ≥ t, for all t ∈ [0, L], the time-dependent covariate X(t) is completely observed. The distribution function of X(t) is then determined only by B_X(t). In addition, X(t) is continuously differentiable on [0, L] with probability one, and sup ||X'(t)|| < ∞ over t ∈ [0, L], where ||·|| denotes the Euclidean norm in real space and X'(t) the derivative of X(t) with respect to time t.
(C2) For all t < L, the intensity of the counting process N_C(t) is determined only by B_X(t) and X(t), conditional on B_X(t), B_V(t), X(t) and T ≥ t.
(C3) P(X^T X) is positive and X^T X has full rank. Moreover, if there exists a constant vector C_0 such that X(t)^T C_0 = g(t) for a deterministic function g(t) and all t ∈ [0, L] with positive probability, then C_0 = 0 and g(t) = 0.
(C4) β_{0τ}, the true value of the parameter β_τ, satisfies ||β_{0τ}|| ≤ C_1, where C_1 is a known positive constant.
(C5) There exists a constant a > 0 such that K_1(L) ≥ a and K_2(L) ≥ a.

Under these conditions, the proposed estimator of the CQF is consistent, as stated in the following theorem.

Theorem 6. Under conditions (C1)-(C5), the estimator of the CQF given by Equation (6) is consistent for all s ∈ [0, L].

The simulation study shows that the cumulative quantile function (CQF) performs better when the state process V(t) has outliers. Finally, medical cost data from the Multicenter Automatic Defibrillator Implantation Trial (MADIT) are analyzed to illustrate the proposed method.
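
As a rough illustration of the binomial thinning operator defined in the abstract, the following Python sketch simulates a first-order, fixed-coefficient Poisson INAR(1) process X_t = α ∘ X_{t−1} + ε_t and estimates (α, λ) by conditional least squares, that is, by regressing X_t on X_{t−1}. This is a deliberately simplified stand-in and not the PoDDRCINAR(p) model of the thesis. The function names, parameter values and seed are assumptions made for the example.

```python
import math
import random


def binomial_thinning(alpha, x, rng):
    """Return alpha ∘ x: the sum of x i.i.d. Bernoulli(alpha) variables."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def poisson_draw(lam, rng):
    """Draw from Poisson(lam) with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_inar1(alpha, lam, n, rng, x0=0):
    """Simulate X_t = alpha ∘ X_{t-1} + eps_t with eps_t ~ Poisson(lam)."""
    series, prev = [], x0
    for _ in range(n):
        prev = binomial_thinning(alpha, prev, rng) + poisson_draw(lam, rng)
        series.append(prev)
    return series


def cls_inar1(series):
    """Conditional least squares for INAR(1): slope -> alpha, intercept -> lam."""
    y, z = series[1:], series[:-1]
    n = len(y)
    z_bar, y_bar = sum(z) / n, sum(y) / n
    sxy = sum((a - z_bar) * (b - y_bar) for a, b in zip(z, y))
    sxx = sum((a - z_bar) ** 2 for a in z)
    alpha_hat = sxy / sxx
    lam_hat = y_bar - alpha_hat * z_bar
    return alpha_hat, lam_hat


# Illustrative values only (assumed, not taken from the thesis).
rng = random.Random(0)
series = simulate_inar1(alpha=0.5, lam=2.0, n=5000, rng=rng)
alpha_hat, lam_hat = cls_inar1(series)
print(f"alpha_hat={alpha_hat:.3f}, lam_hat={lam_hat:.3f}")
```

The thinning step is written as an explicit sum of Bernoulli draws so that it mirrors the definition directly. In practice a single binomial draw with parameters (x, α) would give the same distribution at lower cost.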
Keywords/Search Tags: DDRCINAR(p) process, integer-valued time series, cumulative quantile function, medical cost data, longitudinal quantile regression