Statistical Inference For Mixed Integer-valued Time Series Models

Posted on: 2020-12-16; Degree: Doctor; Type: Dissertation
Country: China; Candidate: Y Q Yang; Full Text: PDF
GTID: 1360330575978814; Subject: Probability theory and mathematical statistics
Abstract/Summary:
Integer-valued time series are formed by recording the state of a statistical indicator of some phenomenon at successive times, and such count series arise very frequently. This type of data is widely used in communication security, medicine and health care, law and society, actuarial insurance, and many other fields. Examples include mobile traffic in a given area over a given period, the number of traffic accidents of a given type, the number of hospitalized patients, the number of crimes, and the monthly number of insurance claims of a given type in a city. The study of integer-valued time series has received increasing attention. Such data usually show short-term dependence and take non-negative integer values, so fitting them with a general time series model usually produces poor predictions. Integer-valued data are therefore more difficult to study than traditional time series data with continuous real values.

At present there are two main approaches to modeling integer-valued time series: (1) state space models based on a latent process, proposed by Fukasawa and Basawa (2002); and (2) models built from thinning operators. The introduction of the binomial thinning operator is the cornerstone of the development of these models. Steutel and van Harn (1979) proposed an integer-valued autoregressive process based on a thinning operator generated by a counting series of Bernoulli random variables. The operator $\alpha \circ X = \sum_{i=1}^{X} B_i$ is called the binomial thinning operator, where $\alpha \in (0,1)$, $X$ is a non-negative integer-valued random variable, and $\{B_i\}$ is an i.i.d. sequence of Bernoulli random variables, independent of $X$, with $P(B_i = 1) = 1 - P(B_i = 0) = \alpha$. The binomial thinning operator has a limitation: because $\{B_i\}$ is an i.i.d. Bernoulli sequence, each $B_i$ can only be 0 or 1, so the value of $\alpha \circ X$
is always less than or equal to $X$. In real data, however, such as the number of cases of an infectious disease in a given period or the number of crimes of a given type, each event may give rise to several further related events. It therefore seems more appropriate to represent such events by geometric random variables. Aly and Bouzar (1994) and Ristic et al. (2009) introduced the negative binomial thinning operator "$*$", defined by $\alpha * X = \sum_{j=1}^{X} W_j$, where $\alpha \in (0,1)$, $X$ is a non-negative integer-valued random variable, and $\{W_j\}$ is an i.i.d. sequence of Geometric($\alpha/(1+\alpha)$) random variables, independent of $X$, with $P(W_j = k) = \alpha^k/(1+\alpha)^{k+1}$, $k \in \mathbb{N}_0$. Because each $W_j$ can take any non-negative integer value, $\alpha * X$ can exceed $X$, which overcomes the limitation of the binomial thinning operator. The negative binomial thinning operator explains overdispersed counting processes well, such as incidence data for infectious diseases.

However, some random events may, with a certain probability, stall or disappear during an observation period, or, with a certain probability, become very active and generate further events after some time. For example, in a given area of a town, a criminal act may generate one or more new cases. A single thinning operator cannot fit such data well. Nastic and Ristic (2012) introduced a first-order mixed integer-valued autoregressive model, which combines the two thinning operators and has parameters in $(0,1)$. Nastic and Ristic (2012) and Ristic and Nastic (2012) considered the conditional least squares and Yule-Walker estimators of the parameters of the mixed integer-valued autoregressive model. Building on their work, we propose the empirical likelihood method, the weighted conditional least squares method, the maximum quasi-likelihood method, and a parameter estimation method under missing data for the $p$-order mixed integer-valued autoregressive model.

The main content of this dissertation is divided into four parts.

In the first part, in order to further understand the properties
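of the first-order mixed integer-valued autoregressive model, we use the empirical likelihood method to derive the empirical likelihood ratio (ELR) statistic for the model parameters, its limiting distribution, and the corresponding confidence region. The coverage of the confidence region is studied by numerical simulation.

Definition 1. The first-order mixed integer-valued autoregressive (MINAR(1)) model has parameters in $(0,1)$ and is built from two thinning operators. The binomial thinning operator is $\alpha \circ_t X = \sum_{i=1}^{X} B_i(t)$, where $\{B_i(t)\}$ is an i.i.d. Bernoulli sequence with $P(B_i(t) = 1) = 1 - P(B_i(t) = 0) = \alpha$. The negative binomial thinning operator is $\alpha *_t X = \sum_{j=1}^{X} W_j(t)$, where $\{W_j(t)\}$ is an i.i.d. sequence of Geometric($\alpha/(1+\alpha)$) random variables. The innovations $\{\varepsilon_t\}$ form an i.i.d. sequence, and $\varepsilon_t$ is independent of $X_{t-1}$, $\alpha \circ_t X_{t-1}$ and $\alpha *_t X_{t-1}$.

Let $E(X_t) = \mu$ and $\theta = (\alpha, \mu)^T$, and assume that (C1) $E(X_t^4) < \infty$ and (C2) $\{X_t\}$ is strictly stationary and ergodic. The conditional least squares (CLS) criterion is $Q(\theta) = \sum_{t=1}^{n} [X_t - E(X_t \mid X_{t-1})]^2 = \sum_{t=1}^{n} (X_t - \alpha X_{t-1} - (1-\alpha)\mu)^2$, and solving $\partial Q(\theta)/\partial\theta = 0$ gives the CLS estimator of $\theta$.

Let $p_1, \dots, p_n$ be non-negative numbers with $\sum_{t=1}^{n} p_t = 1$. The ELR function is $R(F) = L(F)/L(F_n) = \prod_{t=1}^{n} n p_t$. Let $m_t(\theta) = (X_t - \alpha X_{t-1} - (1-\alpha)\mu)\,(-X_{t-1} + \mu,\ \alpha - 1)^T$; then the estimating equation is $\sum_{t=1}^{n} m_t(\theta) = 0$. Suppose the time series $\{X_t\}$ has a non-negative marginal distribution with mean $\mu$. The profile ELR function is $R(\theta) = \max\{\prod_{t=1}^{n} n p_t : p_t \ge 0,\ \sum_{t=1}^{n} p_t = 1,\ \sum_{t=1}^{n} p_t m_t(\theta) = 0\}$, and the maximum may be found by the Lagrange multiplier method. The log ELR statistic is $l(\theta) = -2\log R(\theta) = 2\sum_{t=1}^{n} \log(1 + b^T(\theta)\, m_t(\theta))$, where $b(\theta)$ is the Lagrange multiplier.

The following lemmas are used.

Lemma 1. Under (C1) and (C2), $n^{-1/2}\sum_{t=1}^{n} m_t(\theta) \xrightarrow{d} N(0, M)$, where $M = E[(X_t - \alpha X_{t-1} - (1-\alpha)\mu)^2\,(-X_{t-1} + \mu,\ \alpha - 1)^T(-X_{t-1} + \mu,\ \alpha - 1)]$.

Lemma 2. Under (C1) and (C2), $\frac{1}{n}\sum_{t=1}^{n} m_t(\theta)\, m_t^T(\theta) \xrightarrow{p} M$.

Lemma 3. Under (C1) and (C2), let $Z_n^* = \max_{1 \le t \le n} \|m_t(\theta)\|$. Then $Z_n^* = o_p(n^{1/2})$.

Lemma 4. Under (C1) and (C2), let $b = \rho v$, where $\rho \ge 0$ and $\|v\| = 1$. Then $b = O_p(n^{-1/2})$.

Theorem 1. Under (C1) and (C2), as $n \to \infty$, $l(\theta) \xrightarrow{d} \chi^2(2)$, where $l(\theta)$ is the log ELR statistic defined above.

The confidence region for the parameter $\theta$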
of this MINAR(1) model is constructed from Theorem 1. For $0 < \delta < 1$, the $100(1-\delta)\%$ confidence region is $\{\theta : l(\theta) \le \chi^2_{1-\delta}(2)\}$, where $l(\theta)$ is the log empirical likelihood ratio statistic and $\chi^2_{1-\delta}(2)$ is the upper $\delta$-quantile of the chi-squared distribution with 2 degrees of freedom. The Monte Carlo study shows that the coverage probability of the confidence region increases with the sample size $n$. The simulation results indicate that the coverage probabilities tend to the nominal confidence level 0.95, so the empirical likelihood method performs well.

In the second part, we give the weighted conditional least squares estimator and the maximum quasi-likelihood estimator for the $p$-order mixed integer-valued autoregressive model, and compare their bias and mean squared error by numerical simulation. In general, conditional least squares estimators are not asymptotically efficient. Because $\mathrm{Var}(X_t \mid F_{t-1})$, $\mathrm{Cov}(X_t, X_t^2 \mid F_{t-1})$ and $\mathrm{Var}(X_t^2 \mid F_{t-1})$ depend on $(X_{t-1}, X_{t-2}, \dots, X_{t-p})^T$, we consider weighted conditional least squares (WCLS) estimation to improve the estimator.

Definition 2. Mixed INAR($p$) process (MINAR($p$)), with parameters $\alpha \in (0,1)$ and $\phi_1, \dots, \phi_p$ satisfying $\sum_{i=1}^{p} \phi_i = 1$. Here $\{B_i(t)\}$ is an i.i.d. Bernoulli sequence with $P(B_i(t) = 1) = 1 - P(B_i(t) = 0) = \alpha$, $\{W_j(t)\}$ is an i.i.d. sequence of Geometric($\alpha/(1+\alpha)$) random variables, and $\{\varepsilon_t\}$ is a sequence of i.i.d. random variables with $\varepsilon_t$ independent of $X_{t-m}$, $\alpha \circ_t X_{t-m}$ and $\alpha *_t X_{t-m}$ for $m = 1, \dots, p$. The necessary and sufficient condition for the existence of a stationary and ergodic MGINAR($p$) model is $0 < \alpha < \mu/(1+\mu)$.

For WCLS estimation, let $e_t = X_t - E(X_t \mid F_{t-1})$ and $v_t = E(e_t^2 \mid F_{t-1})$, so that $v_t = \mathrm{Var}(X_t \mid F_{t-1})$. The WCLS estimators are obtained by minimizing $Q(\theta, \mu) = \sum_{t=p+1}^{N} e_t^2 / v_t$, where $\theta = (\theta_1, \theta_2, \dots, \theta_p)^T$ and $\theta_j = \alpha\phi_j$. Replacing $\alpha$, $\phi_j$ and $\mu$
with the corresponding consistent estimates obtained by other means, such as the CLS or Yule-Walker estimates (see Ristic and Nastic (2012)), and denoting the resulting estimate of the conditional variance by $\hat v_t$, we minimize the corresponding weighted sum of squares. Setting the partial derivatives of $Q(\theta, \mu)$ to zero gives a linear system, whose solution yields the WCLS estimators of $\theta_j$, $j = 1, 2, \dots, p$. Since $\sum_{j=1}^{p} \theta_j = \alpha$ and $\phi_j = \theta_j/\alpha$, the WCLS estimators of $\alpha$ and $\phi_j$, $j = 1, 2, \dots, p$, then follow.

The maximum quasi-likelihood estimators (MQEs) for the MGINAR($p$) model can be based on the $p$-dimensional stochastic process $\{(X_t, X_t^2, \dots, X_t^p)\}$. The estimating equations are built from terms $R_m$ that involve powers of $X_t$, the variables $S_{tj}$, the thinned variables $\alpha_t^{(j)} \circ X_{t-j}$, $j = 1, 2, \dots, p$, and the innovations $\varepsilon_t$. $\{X_t\}$ is a Markov process on $\mathbb{N}_0$. Because the sequences of innovations, of the variables $S_{tj}$, $j = 1, 2, \dots, p$, and of the thinned variables are independent, we have $E(R_m \mid F_{t-1}) = 0$. The resulting expression has coefficients of the form $(-1)^i \sigma_{k_1 k_2 \dots k_i}$, $i = 1, 2, \dots, m-k-1$, where $\sigma_{k_1 k_2 \dots k_i}$ is the sum over any $i$ numbers of $0, 1, \dots, m-k-1$, and it involves a $B(X_{t-1}, \alpha)$ variable and a $\mathrm{Ge}(\alpha/(1+\alpha))$ variable. Taking expectations, the quantities $E(\varepsilon_t^k)$, $k = 1, 2, \dots, m$, can be obtained, and hence $E(X_t^m \mid F_{t-1})$, $m = 1, 2, \dots, 2p$. The MQEs $\hat\alpha$
and $\hat\phi_1, \hat\phi_2, \dots, \hat\phi_p$ can be obtained by solving the nonlinear system of estimating equations iteratively. Since the partial derivatives of the estimating functions are too complex for algebraic manipulation, numerically evaluated derivatives are recommended when solving the system with a method such as Newton-Raphson iteration.

In the third part, we present Bayesian estimation for the integer-valued autoregressive model based on the negative binomial thinning operator, in order to study the Bayesian statistical properties of this model. Simulation studies of the bias and mean squared error of the Bayesian estimator are given and compared with conditional least squares and Yule-Walker estimation, and a data example and model prediction are presented.

Definition 3. Integer-valued autoregressive model based on the negative binomial thinning operator, with parameters in $(0,1)$. The operator is $\alpha * X = \sum_{j=1}^{X} W_j$, where $\alpha \in (0,1)$, $X$ is a non-negative integer-valued random variable, and $\{W_j\}$ is an i.i.d. sequence of Geometric($\alpha/(1+\alpha)$) random variables, independent of $X$, with $P(W_j = k) = \alpha^k/(1+\alpha)^{k+1}$, $k \in \mathbb{N}_0$. Ristic et al. (2009) showed that $\{X_t\}$ is a stationary sequence whose marginal distribution is Geometric($\mu/(1+\mu)$). Under the squared error loss function $L(\alpha, \hat\alpha) = (\hat\alpha - \alpha)^2$ we obtain the Bayesian estimator, where $\hat\alpha$
is the Bayesian estimator of $\alpha$. Let $X_0, X_1, \dots, X_T$ be a set of sample observations from the model, with joint likelihood function defined over $i = 1, 2, \dots, T$. The prior distribution of $\alpha$ is the uniform distribution on $(0,1)$, and the posterior distribution of $\alpha$ follows from Bayes' theorem.

Theorem 2. Let $X_0, X_1, \dots, X_T$ be a sample from the model of Definition 3. Under the quadratic loss function and the uniform prior distribution, the Bayesian estimator of $\alpha$ is the posterior mean of $\alpha$.

In the fourth part, we study parameter estimation for the mixed integer-valued autoregressive model with missing data. We give the conditional least squares method without interpolation of the missing data, the mean interpolation method, and the bridge interpolation method, and then conduct a simulation study.

For conditional least squares estimation without interpolation, parameter estimation under missing data follows the idea of least squares: the estimates are obtained by minimizing the total sum of squares $Q(\theta)$, where $E(X_t) = \mu$, $\theta = (\alpha, \mu)^T$, the sample size is $n$, and $X_{k_1}, X_{k_2}, \dots, X_{k_m}$ are the first, second, ..., last observed values, so that there are $m$ observed values and $n-m$ missing values.

For the mean interpolation method, the steps are as follows.

1. Inspect the missingness pattern, take the observed values $X_{k_1}, X_{k_2}, \dots, X_{k_m}$, and compute their mean.
2. If the mean is not an integer, round it to an integer, and use the result as the interpolated value for the missing data.
3. Treat the interpolated data as the complete data $X_1, X_2, \dots, X_n$ and estimate the model parameters with a standard statistical inference method.

For the bridge interpolation method, the steps are as follows. Suppose the sample size is $n$ and that $X_t$ and $X_{t+k}$ are two consecutive observed values, so that there are $k-1$ missing
values between $X_t$ and $X_{t+k}$.

1. Based on the MINAR($p$) model, select an initial value $\theta_0$ from the parameter space.
2. Given the parameter value $\theta_0$ and the observed $X_t$, use the recursive equation of the MINAR($p$) model to simulate a candidate value $\tilde X_{t+1}$; from $\tilde X_{t+1}$ simulate a candidate $\tilde X_{t+2}$, and continue in the same way up to $\tilde X_{t+k}$.
3. If the simulated $\tilde X_{t+k}$ equals the observed $X_{t+k}$, keep the interpolated values $\tilde X_{t+1}, \tilde X_{t+2}, \dots, \tilde X_{t+k-1}$; otherwise repeat step 2 until such values are obtained.
4. Use step 2 to interpolate all the missing data and generate a new complete data set.
5. Perform parameter estimation. If the estimate meets the agreed convergence criterion, stop; otherwise repeat steps 2, 3 and 4.

Through numerical simulation, we study parameter estimation under these four processing methods when the first-order mixed integer-valued autoregressive model has missing data.
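The two thinning operators and the conditional least squares criterion described in this abstract can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the dissertation's implementation: it assumes the mixed model applies binomial thinning with probability p and negative binomial thinning otherwise, with the same thinning parameter alpha in both branches, and it draws the innovations from a geometric law with mean (1 - alpha) * mu so that E(X_t | X_{t-1}) = alpha * X_{t-1} + (1 - alpha) * mu. The innovation law actually used in the dissertation is not recoverable from this page, and all function names are hypothetical.

```python
import math
import random


def geometric(mean, rng):
    """Draw k >= 0 with P(k) = mean**k / (1 + mean)**(k + 1), by inversion."""
    q = mean / (1.0 + mean)
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is finite
    return int(math.log(u) / math.log(q))


def binomial_thinning(alpha, x, rng):
    """alpha o x: sum of x i.i.d. Bernoulli(alpha) variables."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def nb_thinning(alpha, x, rng):
    """alpha * x: sum of x i.i.d. Geometric(alpha / (1 + alpha)) variables."""
    return sum(geometric(alpha, rng) for _ in range(x))


def simulate_minar1(n, alpha, p, mu, seed=0):
    """Simulate X_0, ..., X_n from the ASSUMED mixture form described above."""
    rng = random.Random(seed)
    x = [geometric(mu, rng)]  # arbitrary starting value with mean mu
    for _ in range(n):
        prev = x[-1]
        if rng.random() < p:
            thinned = binomial_thinning(alpha, prev, rng)
        else:
            thinned = nb_thinning(alpha, prev, rng)
        # Assumed innovation law: geometric with mean (1 - alpha) * mu.
        x.append(thinned + geometric((1.0 - alpha) * mu, rng))
    return x


def cls_estimate(x):
    """Minimize sum_t (X_t - alpha * X_{t-1} - (1 - alpha) * mu)**2.

    The criterion is a simple linear regression of X_t on X_{t-1}:
    the slope estimates alpha and intercept / (1 - slope) estimates mu.
    """
    prev, curr = x[:-1], x[1:]
    n = len(curr)
    mean_prev = sum(prev) / n
    mean_curr = sum(curr) / n
    sxy = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(prev, curr))
    sxx = sum((a - mean_prev) ** 2 for a in prev)
    alpha_hat = sxy / sxx
    mu_hat = (mean_curr - alpha_hat * mean_prev) / (1.0 - alpha_hat)
    return alpha_hat, mu_hat


if __name__ == "__main__":
    series = simulate_minar1(20000, alpha=0.4, p=0.5, mu=2.0, seed=1)
    print(cls_estimate(series))
```

Because the conditional mean is linear in $X_{t-1}$, the CLS criterion has a closed-form minimizer here, so no numerical optimizer is needed. The mixing probability p does not enter the conditional mean under these assumptions and is therefore not estimated by this criterion.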
Keywords/Search Tags:Mixed integer-valued autoregressive model, Empirical likelihood, Weighted conditional least squares estimation, Maximum quasi-likelihood estimation, Missing data