Quasi Likelihood Inference And Variable Selection Procedure For Some Classes Of Integer-valued Time Series Models

Posted on: 2018-11-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Y Wan
GTID: 1310330542450128
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Integer-valued time series data abound in daily life: the number of transactions in futures trading, the number of patients in a hospital, the monthly number of insurance claims, and so on. In practice, many integer-valued time series take small values and show a trend with relatively large fluctuations. Differencing is commonly used to remove trend and seasonality from such data. The differenced data are still integer-valued but can be negative, so they cannot be described by the integer-valued autoregressive (INAR) model with the binomial thinning operator proposed by Alzaid and Al-Osh (1987, 1990), because models of this type can only describe non-negative integer-valued time series. To solve this problem, Kim and Park (2008) proposed the integer-valued autoregressive model of order $p$ with the signed binomial thinning operator (INARS($p$)). On this basis, Zhang et al. (2010) extended the signed binomial thinning operator and proposed the so-called signed generalized power series thinning operator, defined as

$$\alpha \odot X = \operatorname{sgn}(\alpha)\,\operatorname{sgn}(X) \sum_{j=1}^{|X|} W_j,$$

where $X$ is an integer-valued random variable, $\{W_j\}$ is an i.i.d. sequence following a generalized power series distribution (GPSD) with $E(W_j) = |\alpha|$, $\mathrm{Var}(W_j) = \beta$ and $\mathrm{Cov}(W_j, X) = 0$, and the sign function is $\operatorname{sgn}(x) = 1$ for $x \ge 0$ and $\operatorname{sgn}(x) = -1$ for $x < 0$.

Remark 1. (i) The probability mass function of a discrete random variable $X$ following a GPSD is

$$P(X = x) = \frac{a(x)\,[g(y)]^{x}}{f(y)}, \quad x \in T,$$

where $T$ is a subset of the non-negative integers, $a(x) > 0$, and $g(y)$ and $f(y)$ are positive and finitely differentiable. (ii) The GPSD family includes the usual Poisson, binomial and negative binomial distributions, among others.

Furthermore, Zhang et al. (2010) proposed the integer-valued autoregressive model of order $p$ with the signed generalized power series thinning operator (GINARS($p$)).

Definition 1. If the process $\{X_t\}$ satisfies

$$X_t = \sum_{i=1}^{p} \alpha_i \odot X_{t-i} + Z_t, \tag{1}$$

then the model is called GINARS($p$), where $\alpha_i \odot X_{t-i} = \operatorname{sgn}(\alpha_i)\,\operatorname{sgn}(X_{t-i}) \sum_{j=1}^{|X_{t-i}|} W_j^{(i)}$ and $\{W_j^{(i)}\}$ is an i.i.d. GPSD
sequence with $E(W_j^{(i)}) = |\alpha_i|$ and $\mathrm{Var}(W_j^{(i)}) = \beta_i$ for $1 \le i \le p$; $\{Z_t\}$ is an i.i.d. sequence of integer-valued random variables, independent of $\{W_j^{(i)}\}$, with $E(Z_t) = \mu$ and $\mathrm{Var}(Z_t) = \sigma_z^2$; $\mathrm{Cov}(Z_t, X_{t-i}) = 0$ for $i \ge 1$; and all the counting sequences in (1) are independent of one another. Let $A$ denote the matrix [equation missing]. By Theorem 2.1 of Zhang et al. (2010), when all the eigenvalues of $A$ lie inside the unit circle, there exists a unique strictly stationary integer-valued series $\{X_t\}$ satisfying [equation missing]. By Proposition 2.1 of Zhang et al. (2010), if $\{X_t\}$ is a stationary process satisfying model (1), then [equation missing].

We study maximum quasi-likelihood (MQL) estimation for GINARS($p$) based on moment estimation. We first state two assumptions:

(A.1) $\{X_t\}$ is a strictly stationary and ergodic process.
(A.2) $E|X_t|^4 < \infty$.

Let $\{X_1, X_2, \ldots, X_n\}$ be a sample of size $n$ generated from model (1). The parameter vector of GINARS($p$) is $\theta = (\alpha_1, \ldots, \alpha_p, \mu)^T$. Setting $\eta = (\beta_1, \ldots, \beta_p, \sigma_z^2)^T$, the conditional variance can be written as

$$\mathrm{Var}(X_t \mid X_{t-1}, \ldots, X_{t-p}) = \sum_{i=1}^{p} \beta_i |X_{t-i}| + \sigma_z^2.$$

Based on the quasi-likelihood method proposed by Wedderburn (1974), we establish the following set of estimating equations: [equation (2) missing]. Solving the equation set (2) gives the MQL estimator $\hat\theta$ of GINARS($p$) in explicit form: [equation missing]. We now study the asymptotic properties of $\hat\theta$.

Theorem 1. Assume that conditions (A.1)-(A.2) hold. For the MQL estimator $\hat\theta$ given by the equation set (2), as $n \to \infty$, [equation missing].

It follows from Theorem 1 that $\hat\theta$ is a consistent estimator of $\theta$. Next, with the help of moment estimation, we use the conditional least squares (CLS) estimator $\tilde\theta = (\tilde\alpha_1, \ldots, \tilde\alpha_p, \tilde\mu)^T$ to obtain a consistent estimator of $\sigma_z^2$.
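The signed thinning construction and the CLS step described above can be illustrated with a minimal sketch. It is not the dissertation's procedure: it assumes $p = 1$, Bernoulli counting variables (the signed binomial special case of Kim and Park (2008), which belongs to the GPSD family), innovations $Z_t$ equal to the difference of two Poisson variables, and the sign convention sgn(0) = 1. All function names and parameter values are hypothetical. Because the conditional mean of $X_t$ is $\alpha X_{t-1} + \mu$, the CLS estimator reduces here to ordinary least squares of $X_t$ on $X_{t-1}$ with an intercept.

```python
import math
import random


def sgn(x):
    """Sign convention assumed in this sketch: +1 for x >= 0, -1 for x < 0."""
    return 1 if x >= 0 else -1


def signed_thinning(alpha, x, rng):
    """sgn(alpha) * sgn(x) * (sum of |x| i.i.d. Bernoulli(|alpha|) variables)."""
    count = sum(rng.random() < abs(alpha) for _ in range(abs(x)))
    return sgn(alpha) * sgn(x) * count


def poisson(lam, rng):
    """Poisson draw by Knuth's multiplication method (fine for small lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate(alpha, lam_plus, lam_minus, n, seed=0, burn_in=200):
    """Simulate X_t = alpha (signed thinning) X_{t-1} + Z_t,
    with Z_t = Poisson(lam_plus) - Poisson(lam_minus) (illustrative choice)."""
    rng = random.Random(seed)
    x, path = 0, []
    for t in range(n + burn_in):
        z = poisson(lam_plus, rng) - poisson(lam_minus, rng)
        x = signed_thinning(alpha, x, rng) + z
        if t >= burn_in:
            path.append(x)
    return path


def cls_estimate(path):
    """CLS for p = 1: least squares of X_t on (X_{t-1}, 1).
    Returns (alpha_hat, mu_hat)."""
    prev, curr = path[:-1], path[1:]
    m = len(curr)
    mean_prev, mean_curr = sum(prev) / m, sum(curr) / m
    sxy = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
    sxx = sum((a - mean_prev) ** 2 for a in prev)
    alpha_hat = sxy / sxx
    return alpha_hat, mean_curr - alpha_hat * mean_prev


# True values in this toy run: alpha = -0.4, mu = E(Z_t) = 2.0 - 1.0 = 1.0.
path = simulate(alpha=-0.4, lam_plus=2.0, lam_minus=1.0, n=5000, seed=1)
alpha_hat, mu_hat = cls_estimate(path)
print("alpha_hat =", round(alpha_hat, 3), " mu_hat =", round(mu_hat, 3))
print("smallest observed value:", min(path))
```

The simulated path takes negative values, which a binomial-thinning INAR model could not produce, and the least-squares estimates should land near the values used in the simulation.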
Theorem 2. Under conditions (A.1)-(A.2), a consistent estimator of $\sigma_z^2$ is [equation missing].

In practice, Theorem 2 can be used to estimate the variance of the error term and hence the conditional variance of $X_t$. We now turn to region estimation for $\theta$. In the following theorem, we use the normal approximation to construct a quasi-likelihood confidence region for $\theta$.

Theorem 3. Assume that (A.1)-(A.2) hold. For $0 < \delta < 1$, the $100(1-\delta)\%$ confidence region of $\theta$ is [equation missing], where $\hat T$ is a consistent estimator of $T(\theta)$ and $\chi^2_{p+1}(\delta)$ denotes the upper $\delta$ quantile of the $\chi^2$ distribution with $p + 1$ degrees of freedom.

When modeling integer-valued time series data, one usually introduces as many variables as possible in order to retain significant information. In practice, if the lag order of the model is large, the model may include insignificant variables, which makes it more complicated and less accurate. We therefore use variable selection when modeling integer-valued time series data. We consider the INAR(1) model proposed by Alzaid and Al-Osh (1987),

$$Y_t = \alpha \circ Y_{t-1} + Z_t,$$

where "$\circ$" is the binomial thinning operator proposed by Steutel and Van Harn (1979), defined as

$$\alpha \circ Y_{t-1} = \sum_{j=1}^{Y_{t-1}} B_j,$$

where $Y_{t-1}$ is a non-negative integer-valued random variable and $\{B_j\}$ is an i.i.d. Bernoulli sequence, independent of $Y_{t-1}$, satisfying $P(B_j = 1) = \alpha = 1 - P(B_j = 0)$. $\{Z_t\}$ is an i.i.d. Poisson sequence with parameter $\lambda$, and $Z_t$ is independent of $Y_{t-1}$ and of the counting series $\{B_j\}$. To describe covariates that influence the variance of the error at different times, we replace the error term by [equation missing], where $\beta = (\beta_1, \ldots, \beta_p)^T$ is a parameter vector and $X_t = (X_{1t}, \ldots, X_{pt})^T$ is the covariate vector. We set $\theta$
$= (\alpha, \beta^T)^T$ as the parameter vector of the model. By definition, the conditional least squares (CLS) criterion function of the INAR(1) model with covariates is [equation missing]. We set $m_t(\theta) = -\tfrac{1}{2}\,\partial S_t(\theta)/\partial\theta$. Then the solution of $\sum_{t=1}^{n} m_t(\theta) = 0$ is the CLS estimator $\tilde\theta$. To select the significant variables, we use penalized estimation. Following Zou (2006), we minimize the penalized criterion $Q_n(\theta)$ [equation missing] to obtain the penalized estimator $\hat\theta$. Here $P_\lambda(|\theta_i|)$ is the adaptive LASSO penalty proposed by Zou (2006), defined as

$$P_\lambda(|\theta_i|) = \lambda w_i |\theta_i|,$$

where $\lambda$ is the threshold parameter, the weight is $w_i = 1/|\tilde\theta_i|^{r}$, and $r > 0$ is the shape parameter. Because of the weights, the significant variables are only slightly affected when the penalty removes the insignificant ones.

We now introduce some theoretical properties of $\hat\theta$. For convenience, write the true value as $\theta_0 = (\theta_{10}^T, \theta_{20}^T)^T$, where $\theta_{10}$ and $\theta_{20}$ collect the nonzero and zero components of $\theta_0$, respectively. We define

$$a_n = \max_{1 \le j \le s} \big|\dot P_{\lambda_n}(|\theta_{j0}|)\big| \quad \text{and} \quad b_n = \max_{1 \le j \le s} \big|\ddot P_{\lambda_n}(|\theta_{j0}|)\big|,$$

where $s$ is the number of components of $\theta_{10}$, and $\dot P(\cdot)$ and $\ddot P(\cdot)$ denote the first and second derivatives of $P(\cdot)$, respectively. We set $\Gamma = E(\partial m_t(\theta_0)/\partial\theta)$ and $\Sigma = E\,m_t(\theta_0)\,m_t^T(\theta_0)$, where [equation missing]. Moreover, we need three regularity conditions:

(i) The process $\{Y_t\}$ is strictly stationary and ergodic, and $E|Y_t|^4 < \infty$.
(ii) $a_n = O(n^{-1/2})$.
(iii) $b_n = o(1)$.

Condition (i) ensures that the INAR(1) model with covariates has a consistent estimator. Condition (ii) guarantees that the penalized estimator $\hat\theta$ is $\sqrt{n}$-consistent. Condition (iii) ensures that the influence of the penalty function does not exceed that of the CLS criterion.

Theorem 4. Under conditions (i)-(iii), there exists a local minimizer $\hat\theta$ of $Q_n(\theta)$ such that $\|\hat\theta - \theta_0\| = O_p(n^{-1/2} + a_n)$.

Theorem 4 shows that there exists a $\sqrt{n}$-consistent penalized estimator of the true value $\theta_0$. The following Lemma 1 shows that $\hat\theta$ is sparse.

Lemma 1. Assume that $\liminf_{n\to\infty} \liminf_{\theta_j \to 0^+} \lambda_n^{-1} \dot P_{\lambda_n}(|\theta_j|) > 0$ and that conditions (i)-(ii) hold. Then, with probability tending to 1, for any given $\theta_1$ satisfying $\|\theta_1 - \theta_{10}\| = O_p(n^{-1/2})$ and any positive constant, we have [equation missing].

Note that $\liminf_{n\to\infty} \liminf_{\theta_j \to 0^+} \lambda_n^{-1} \dot P_{\lambda_n}(|\theta_j|) = 1/|\tilde\theta_j|^{r} > 0$, so the assumption in Lemma 1 is reasonable. Finally, we state the oracle properties of $\hat\theta$, established with the technique of Fan and Li (2001).

Theorem 5. Under conditions (i)-(ii), with probability tending to 1, the root-$n$ consistent estimator in Theorem 4 satisfies:
(a) Sparsity: $\hat\theta_2 = 0$;
(b) Asymptotic normality: [equation missing],
where $\Gamma_s$ and $\Sigma_s$ denote the submatrices of $\Gamma$ and $\Sigma$ corresponding to $\theta_{10}$, respectively.

In practice, if $\{Y_t\}$ is strictly stationary and ergodic, then by the ergodic theorem [equation missing] are consistent estimators of $\Gamma$ and $\Sigma$, respectively. Based on Theorem 5, we can use $\hat\Gamma_s$ and $\hat\Sigma_s$ to estimate the covariance matrix of $\hat\theta_1$.

Moreover, we consider a variable selection procedure for the Poisson autoregressive model with sparse structure. Ferland et al. (2006) extended the idea of conditional heteroskedasticity to integer-valued time series data and proposed the Poisson autoregressive model, which was used to model the number of cases of campylobacteriosis infections from January 1990 to the end of October 2000 in the north of the Province of Québec. The model is defined as

$$X_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(\lambda_t), \qquad \lambda_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i X_{t-i}, \tag{4}$$

where $\mathcal{F}_{t-1}$ is the $\sigma$-field generated by $\{X_{t-1}, X_{t-2}, \ldots\}$ and $\alpha_0 > 0$, $\alpha_i \ge 0$, $i = 1, \ldots, p$. For convenience, let $\theta = (\alpha_0, \alpha_p, \ldots, \alpha_1)^T$. Following Zhu and Wang (2011), the conditional log-likelihood function for model (4) is $L_n(\theta)$ [equation (5) missing]. Motivated by (3.14) in Fan and Lv (2010), we propose the penalized conditional maximum likelihood (PCML) function $Q_n(\theta)$ [equation missing], where $L_n(\theta)$ is defined in (5) and $P_\lambda(\cdot)$ is a penalty function. We focus on the following four penalty functions, which have the oracle property (Fan and Li, 2001):

(P.1) The SCAD penalty function (Fan and Li, 2001) is defined through its derivative

$$\dot P_\lambda(\theta) = \lambda \left\{ I(\theta \le \lambda) + \frac{(a\lambda - \theta)_+}{(a-1)\lambda}\, I(\theta > \lambda) \right\}, \quad \theta > 0,$$

with $P_\lambda(0) = 0$, where $\lambda > 0$ is the tuning parameter and $a > 2$ is the shape parameter.

(P.2) The adaptive LASSO (Zou, 2006) is defined as $P_\lambda(|\theta_i|) = \lambda w_i |\theta_i|$, where $\lambda > 0$ is the tuning parameter and $r > 0$ is the shape parameter. The weight is $w_i = 1/|\tilde\theta_i|^{r}$, where $\tilde\theta_i$ is the conditional maximum likelihood (CML) estimator.

(P.3) The MCP (Zhang, 2010) is defined as

$$P_\lambda(|\theta_i|) = \lambda \int_0^{|\theta_i|} \left(1 - \frac{x}{\gamma\lambda}\right)_+ dx,$$

where $\lambda > 0$ is the tuning parameter and $\gamma > 0$ is the shape parameter.

(P.4) The SELO function of Dicker et al. (2013) is defined as

$$P_\lambda(|\theta_i|) = \frac{\lambda}{\log 2} \log\!\left(\frac{|\theta_i|}{|\theta_i| + \tau} + 1\right),$$

where $\lambda > 0$ is the tuning parameter and $\tau > 0$ is the shape parameter.

The PCML estimator is obtained by maximizing $Q_n(\theta)$, that is, [equation missing]. Next, we study some theoretical properties of the PCML estimator, including consistency and the oracle property. Let the true value of model (4) be $\theta_0 = (\alpha_{00}, \alpha_{0p}, \ldots, \alpha_{01})^T$. Without loss of generality, we assume that $\theta_0 = (\theta_{10}^T, \theta_{20}^T)^T$, where $\theta_{20} = 0$. We set

$$a_n = \max_{1 \le j \le s} \big|\dot P_{\lambda_n}(|\theta_{j0}|)\big| \quad \text{and} \quad b_n = \max_{1 \le j \le s} \big|\ddot P_{\lambda_n}(|\theta_{j0}|)\big|,$$

where $s$ is the number of components of $\theta_{10}$, and $\dot P(\cdot)$ and $\ddot P(\cdot)$ denote the first and second derivatives of the penalty function $P(\cdot)$, respectively. The subscript in $\lambda_n$ indicates that $\lambda$ depends on the sample size $n$. To study the theoretical properties of $\hat\theta$, we need the regularity conditions (C.1)-(C.3): [conditions missing].

Here (C.1) ensures that $\{X_t\}$ is strictly stationary and ergodic (Doukhan et al., 2012), which is used for the asymptotic properties of $\hat\theta$. Zhu and Wang (2011) proved that, for any positive integer $m$, $E(X_t^m) < \infty$ if and only if (C.1) holds. (C.2) ensures that the estimator is $\sqrt{n}$-consistent. (C.3) ensures that the influence of the penalty function on the resulting estimator does not exceed that of the CML criterion function. To check that (C.2) and (C.3) are reasonable for the SCAD penalty (the other penalties are similar), some calculation gives [equation missing] and [equation missing]. Then the classical condition for penalty-based procedures (Fan and Li, 2001) with $\lambda_n = O(n^{-1/2})$ ensures that (C.2) and (C.3) hold. We now turn to the properties of the PCML estimator.

Theorem 6. Under conditions (C.1)-(C.3), there exists a local maximizer $\hat\theta$ of $Q_n(\theta)$ such that $\|\hat\theta - \theta_0\|$
$= O_p(n^{-1/2} + a_n)$.

The above theorem implies that there exists a $\sqrt{n}$-consistent estimator of $\theta_0$. To establish the sparsity of the PCML estimator, we need the following lemma.

Lemma 2. Assume that $\liminf_{n\to\infty} \liminf_{\theta_j \to 0^+} \lambda_n^{-1} \dot P_{\lambda_n}(|\theta_j|) > 0$ and that conditions (C.1)-(C.3) hold. Then, with probability tending to 1, for any given $\theta_1$ satisfying $\|\theta_1 - \theta_{10}\| = O_p(n^{-1/2})$ and any positive constant, we have [equation missing].

The assumption $\liminf_{n\to\infty} \liminf_{\theta_j \to 0^+} \lambda_n^{-1} \dot P_{\lambda_n}(|\theta_j|) > 0$ is mild, since this limit equals 1 for the SCAD penalty (the other cases are similar). We now discuss the oracle property of the PCML estimator. The oracle property, proposed by Fan and Li (2001), means that the penalized estimation method performs as well as if the true model were known in advance. The following result establishes the oracle property of the PCML estimator $\hat\theta$.

Theorem 7. Under conditions (C.1)-(C.3), with probability tending to 1, the root-$n$ consistent estimator in Theorem 6 satisfies:
(i) Sparsity: $\hat\theta_2 = 0$;
(ii) Asymptotic normality: [equation missing],
where $\Omega_s(\theta_0)$ is the Fisher information ($\Omega(\theta_0)$ is defined in Lemma 4.5.1) with $\theta_{20} = 0$.

Theorem 7 shows that the PCML estimator has the oracle property. Since $\{X_t\}$ is a strictly stationary and ergodic process, the ergodic theorem gives

$$\hat\Omega_s(\theta_0) = \frac{1}{n} \sum_{t=1}^{n} \frac{1}{\lambda_t}\, Y_t^{s} (Y_t^{s})^T \xrightarrow{\text{a.s.}} \Omega_s(\theta_0),$$

where $Y_t^{s} = (1, X_{t-p}, \ldots, X_{t-p+s-2})^T$. Therefore, a consistent covariance estimate for $\hat\theta_1$ is [equation missing].
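The penalty functions (P.1)-(P.3) named in the abstract can be written out as a short sketch. It uses the closed forms from the cited papers (Fan and Li, 2001; Zou, 2006; Zhang, 2010) rather than anything specific to the dissertation. The function names are hypothetical, the default `a=3.7` is the value suggested by Fan and Li (2001), and `gamma=3.0` is only an illustrative default. The sketch also checks numerically that the SCAD derivative divided by $\lambda$ equals 1 near zero, which is the limit quoted after Lemma 2.

```python
def scad(theta, lam, a=3.7):
    """SCAD penalty value; requires lam > 0 and a > 2."""
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def scad_derivative(theta, lam, a=3.7):
    """Derivative of the SCAD penalty with respect to |theta|, for |theta| > 0."""
    t = abs(theta)
    if t <= lam:
        return lam
    return max(a * lam - t, 0.0) / (a - 1)


def mcp(theta, lam, gamma=3.0):
    """MCP value: lam * integral_0^|theta| (1 - x / (gamma * lam))_+ dx."""
    t = abs(theta)
    if t <= gamma * lam:
        return lam * t - t * t / (2 * gamma)
    return gamma * lam * lam / 2


def adaptive_lasso(theta, lam, theta_init, r=1.0):
    """Adaptive LASSO: lam * w * |theta| with weight w = 1 / |theta_init|**r,
    where theta_init is an initial (for example CLS or CML) estimate."""
    return lam * abs(theta) / abs(theta_init) ** r


# Small coefficients are penalized like the LASSO; large ones get a flat penalty.
for theta in (0.1, 1.0, 5.0):
    print(theta, round(scad(theta, lam=1.0), 4), round(mcp(theta, lam=1.0), 4))

# Limit used in the sparsity lemmas: derivative / lam near zero.
print(scad_derivative(1e-9, lam=2.0) / 2.0)
```

SCAD and MCP both level off for large coefficients, so large effects are not shrunk further, while the adaptive LASSO achieves a similar effect through data-driven weights.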
Keywords/Search Tags: integer-valued autoregressive model, Poisson autoregressive, quasi likelihood, variable selection