
Statistical Inference For Integer-valued Multinomial Autoregressive Processes

Posted on: 2021-01-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Zhang
GTID: 1360330623477306
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Integer-valued time series data arise widely in practice and in science. They fall into two categories. One type takes values in Z, such as the number of global earthquakes or the number of insurance claims. Models for this type of data can be traced back to the first-order integer-valued autoregressive process (McKenzie (1985), Al-Osh and Alzaid (1987)) and the geometric first-order integer-valued autoregressive process (Ristic et al. (2009)). The other type, which is the main subject of this thesis, is finite-range integer-valued time series data. Typical examples are the weekly number of rainy days, medicine monitoring and instrument testing. To handle this type of data, McKenzie (1985) proposed the first-order binomial autoregressive (BAR(1)) process. Note that the BAR(1) process only covers data sets with two states and cannot effectively describe data sets with more states. In this thesis, we mainly study the integer-valued multinomial autoregressive process, which describes data that switch among three states.

The main content is divided into three parts. In the first part, we review the first-order random coefficient integer-valued autoregressive process and extend it to the RCINAR(1) process with generalized negative binomial marginals. We then examine the effectiveness and robustness of the estimation methods through simulation, and fit a set of real data with the proposed model. The second part is the core of the thesis. For finite-range integer-valued time series data with three states, we propose a first-order integer-valued multinomial autoregressive (MAR(1)) process. We prove the strict stationarity and ergodicity of the MAR(1) model, and study its basic probabilistic and statistical properties and the estimation methods. Finally, we use the model to fit a set of real data. Comparing the fitting results with those of other bivariate binomial autoregressive models, we can
illustrate the advantages of our model. In the third part, based on the MAR(1) process, we consider cases in which the parameters are affected by other factors. For this situation we propose a first-order random coefficient integer-valued multinomial autoregressive process and a covariates-driven first-order integer-valued multinomial autoregressive process. We also study the basic probabilistic and statistical properties and the estimation methods of these models. The effectiveness and advantages of the random coefficient extension are illustrated with a real data set. In what follows, we introduce the main results of this thesis.

1. Parameter estimation of the RCINAR(1) process with generalized negative binomial marginals.

For integer-valued time series data with over-dispersion, based on the binomial thinning operator, we consider the first-order random coefficient integer-valued autoregressive process with generalized negative binomial marginals, named the GNB-RCINAR(1) process:

$$X_t = \alpha_{m,t} \circ X_{t-1} + \varepsilon_t, \qquad (1)$$

where
(i) $\{\varepsilon_t, t \ge 1\}$ is an i.i.d. sequence with distribution NB$(m(1-\alpha), p)$;
(ii) the binomial thinning operator "$\circ$" is defined as

$$\alpha \circ X = \sum_{k=1}^{X} Y_k, \qquad (2)$$

where $\{Y_k\}$ is an i.i.d. sequence of random variables with distribution B$(1, \alpha)$;
(iii) $\{\alpha_{m,t}\}$ is an i.i.d. sequence with distribution Beta$(m\alpha, m(1-\alpha))$. For $t \in \mathbb{N}$, $\alpha_{m,t}$ is independent of $\varepsilon_t$ and $\{X_s\}_{s<t}$.

Suppose $\{X_t\}_{t=1}^{N}$ is a sample generated from the GNB-RCINAR(1) process and $\theta = (m, \alpha, p)^T$ is the parameter vector of interest. We consider Yule-Walker estimation, conditional least squares (CLS) estimation, conditional maximum likelihood (CML) estimation and Bayesian estimation of $\theta$. The following theorems establish the asymptotic normality of the corresponding estimators.

Theorem 1. Suppose $E|X_t|^4 < \infty$. Then the Yule-Walker estimators $\hat m_{YW}$, $\hat\alpha_{YW}$ and $\hat p_{YW}$ are asymptotically normal.

In the conditional expectation, $m$ and $p$ appear in the same term $m(1-\alpha)(1-p)/p$, so the CLS estimators of $m$ and $p$ cannot be obtained at the same time. We therefore use a two-step conditional least squares method to estimate $(\alpha, p)^T$ and $m$ separately.

Theorem 2. Suppose $E|X_t|^4 < \infty$. Then the CLS
estimators $(\hat\alpha_{CLS}, \hat p_{CLS})^T$ are asymptotically normal.

Theorem 3. Suppose $E|X_t|^6 < \infty$. Then the CLS estimator $\hat m_{CLS}$ is asymptotically normal.

The asymptotic normality of the CML estimators is obtained as in Franke and Seligmann (1993), which is a special case of Theorem 2.1 and Theorem 2.2 in Billingsley (1961). We have:

Theorem 4. Let $\{X_t\}$ be generated from the GNB-RCINAR(1) process. Under assumptions (C1) to (C6), the CML estimators $\hat\theta_{CML} = (\hat m_{CML}, \hat\alpha_{CML}, \hat p_{CML})^T$ are strongly consistent.

Theorem 5. Under the assumptions of Theorem 4, the CML estimators $\hat\theta_{CML}$ are asymptotically normal, with asymptotic covariance matrix $I^{-1}(\theta)$, where $I(\theta)$ is the Fisher information matrix.

From the simulation we find that, for the parameter $m$, the Yule-Walker estimator and the CLS estimator perform much worse than the CML estimator and the Bayesian estimator. A possible reason is that the Yule-Walker and CLS estimators of $m$ rely on higher-order sample moments, which may increase the uncertainty of the estimation, whereas the CML and Bayesian estimators of $m$ use the likelihood function and therefore perform much better. We therefore consider a modified estimator of $m$, whose asymptotic normality is given in the following theorem.

Theorem 6. Suppose $E|X_t|^4 < \infty$. Then the modified estimator $\hat m_M$ is asymptotically normal.

The specific forms of the asymptotic distributions in the above theorems are given in Section 2.2. We compare the estimation methods through simulation and examine the robustness of the estimators when the data are contaminated. The simulation results show that the modified Yule-Walker estimator and the modified CLS estimator of $m$ are effective. Moreover, for the parameters $\theta$, Bayesian estimation is the most robust and effective of all the methods. Finally, we use the model to fit data on road accidents in the Schiphol area in the Netherlands. The results show that the GNB-RCINAR(1) process fits this type of data well.

2. Modelling and statistical inference
for the first-order integer-valued multinomial autoregressive process.

For finite-range integer-valued time series data with three states, we propose a first-order integer-valued multinomial autoregressive (MAR(1)) process. Examples of such data are the numbers of people in a fixed group who switch among homosexuality, heterosexuality and bisexuality; the numbers of people who are risk-averse, risk-neutral or risk-seeking; and the numbers of people who are cognitively normal (CN), have mild cognitive impairment (MCI) or are diagnosed with Alzheimer's disease (AD). The process is defined as follows.

Definition 1. Let $X_t = (X_{1t}, X_{2t})^T$ be a multinomial random variable, let $n \in \mathbb{N}$ be a given number denoting the upper limit of the multinomial range, and let $\alpha_i, \beta_i, \gamma_i \in (0,1)$ for $i = 1, 2$. Define $Z_{1t} = \alpha_1 \circ X_{1,t-1}$, $Z_{2t} = \beta_1 \circ X_{2,t-1}$ and $Z_{3t} = \gamma_1 \circ (n - X_{1,t-1} - X_{2,t-1})$. Then the first-order integer-valued multinomial autoregressive process $\{X_t\}$ satisfies the following recursion:

$$X_{1t} = Z_{1t} + Z_{2t} + Z_{3t},$$
$$X_{2t} = \alpha_2 \circ (X_{1,t-1} - Z_{1t}) + \beta_2 \circ (X_{2,t-1} - Z_{2t}) + \gamma_2 \circ (n - X_{1,t-1} - X_{2,t-1} - Z_{3t}), \qquad (3)$$

where all thinnings are performed independently of each other, and the thinnings at time $t$ are independent of $\{X_s\}_{s<t}$.

Proposition 1 gives the transition probability function of the MAR(1) process, which is used in the CML estimation of the parameters.

Proposition 1. The transition probability function of the MAR(1) process is derived as equation (4).

In the next proposition, we establish the strict stationarity and the ergodicity of the MAR(1) process, which are used to derive the basic probabilistic and statistical properties of the model and the asymptotic distributions of the estimators.

Proposition 2. The process $\{X_t\}$ in Definition 1 is an irreducible, aperiodic and positive recurrent (i.e. ergodic) Markov chain. Hence, there exists a strictly stationary process satisfying Model (3).

We then consider estimation of the parameters $\theta = (\alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_1, \gamma_2)^T$. For CLS estimation, we obtain the asymptotic distributions of the estimators of $\theta_1 = (\alpha_1, \beta_1, \gamma_1)^T$ and $\theta_2 = (\alpha_2, \beta_2, \gamma_2)^T$ separately.

Theorem 7. Let $\{X_t\}$ be generated from the MAR(1) process with $E\|X_t\|^4 < \infty$. Then the CLS estimators
$\hat\theta_{1,CLS}$ are asymptotically normal.

Theorem 8. Suppose $E\|X_t\|^4 < \infty$. Then the CLS estimators $\hat\theta_{2,CLS}$ are asymptotically normal.

To improve the efficiency of the CLS estimators, we use the inverse of the conditional variance as the weight and obtain the weighted conditional least squares (WCLS) estimators. The corresponding asymptotic theory is given in Theorem 9.

Theorem 9. Suppose $E\|X_t\|^4 < \infty$. Then the WCLS estimators $\hat\theta_{1,WCLS}$ and $\hat\theta_{2,WCLS}$ are each asymptotically normal.

Based on the transition probability in Proposition 1, the CML estimators are obtained by maximizing the conditional log-likelihood function. Their asymptotic normality is given in the following theorem.

Theorem 10. Let $\{X_t\}$ satisfy the MAR(1) process. Then the CML estimators $\hat\theta_{CML}$ are asymptotically normal, with asymptotic covariance matrix $I^{-1}(\theta)$, where $I(\theta)$ is the Fisher information matrix.

The specific forms of the asymptotic distributions in the above theorems are given in Section 3.5. We compare CLS estimation, WCLS estimation and CML estimation via simulation. The simulation results show that the inverse of the conditional variance is a satisfactory weight and that the WCLS method improves the efficiency of CLS estimation. Moreover, the CML method always performs better than the other two methods. Boxplots, histograms and QQ plots also support these conclusions. Finally, we use the proposed model to fit a set of monthly income data, and compare the fitting results with those of some bivariate binomial autoregressive models. The results show that the MAR(1) model explains finite-range integer-valued time series data with three states well.

3. The extended studies of the integer-valued multinomial autoregressive model.

In this section, we consider extensions of the integer-valued multinomial autoregressive model in which the coefficients are affected by other factors. First, we introduce a first-order random coefficient integer-valued multinomial autoregressive (RCMAR(1)) process.

Definition 2. Let $X_t = (X_{1t}, X_{2t})^T$ be a multinomial random variable. If the $X_t$ satisfy the
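following recursion:

$$X_{1t} = Z_{1t} + Z_{2t} + Z_{3t},$$
$$X_{2t} = \alpha_{2t} \circ (X_{1,t-1} - Z_{1t}) + \beta_{2t} \circ (X_{2,t-1} - Z_{2t}) + \gamma_{2t} \circ (n - X_{1,t-1} - X_{2,t-1} - Z_{3t}),$$

then $\{X_t\}_{t \in \mathbb{N}}$ is a first-order random coefficient integer-valued multinomial autoregressive (RCMAR(1)) process, where
(i) $Z_{1t} = \alpha_{1t} \circ X_{1,t-1}$, $Z_{2t} = \beta_{1t} \circ X_{2,t-1}$, $Z_{3t} = \gamma_{1t} \circ (n - X_{1,t-1} - X_{2,t-1})$;
(ii) $n \in \mathbb{N}$ is a given number denoting the upper limit of the multinomial range;
(iii) $\{\alpha_{it}\}$, $\{\beta_{it}\}$ and $\{\gamma_{it}\}$ are i.i.d. sequences of random variables taking values in $(0,1)$, with cumulative distribution functions (CDFs) $P_{\alpha_i}$, $P_{\beta_i}$ and $P_{\gamma_i}$, $i = 1, 2$;
(iv) $\mu_{\alpha_i} = E(\alpha_{it})$, $\mu_{\beta_i} = E(\beta_{it})$, $\mu_{\gamma_i} = E(\gamma_{it})$, $\sigma^2_{\alpha_i} = \mathrm{Var}(\alpha_{it})$, $\sigma^2_{\beta_i} = \mathrm{Var}(\beta_{it})$ and $\sigma^2_{\gamma_i} = \mathrm{Var}(\gamma_{it})$, $i = 1, 2$, are all assumed finite;
(v) all thinnings are performed independently of each other, and the thinnings at time $t$ are independent of $\{X_s\}_{s<t}$.

The basic probabilistic and statistical properties, the strict stationarity and the ergodicity are similar to those of the MAR(1) process. For the RCMAR(1) process, the parameters of interest are $\theta = (\mu_{\alpha_1}, \mu_{\alpha_2}, \mu_{\beta_1}, \mu_{\beta_2}, \mu_{\gamma_1}, \mu_{\gamma_2})^T$. In the simulation we compare CLS estimation, WCLS estimation and CML estimation when the distributions of the random coefficients are given. Let the random coefficients $\{\alpha_{it}\}$, $\{\beta_{it}\}$ and $\{\gamma_{it}\}$ follow power function distributions PF$(1, \cdot)$, each with its own second parameter, $i = 1, 2$. The CML estimation of $\theta$ can then be converted to CML estimation of the vector of these six distribution parameters, and $\hat\theta_{CML}$ is obtained from it by the corresponding transformation. The simulation results are similar to those for the MAR(1) process: CML estimation is the best of the three methods, and WCLS estimation improves on CLS estimation.

In addition, we incorporate the logistic regression model into the MAR(1) process to build a covariates-driven MAR(1) process, called the CMAR(1) process.

Definition 3. The CMAR(1) process $\{X_t\}$ is a sequence of multinomial random variables defined by the following equations:

$$X_{1t} = Z_{1t} + Z_{2t} + Z_{3t},$$
$$X_{2t} = \alpha_2 \circ (X_{1,t-1} - Z_{1t}) + \beta_2 \circ (X_{2,t-1} - Z_{2t}) + \gamma_2 \circ (n - X_{1,t-1} - X_{2,t-1} - Z_{3t}),$$

where
(i) $Z_{1t} = \alpha_1 \circ X_{1,t-1}$, $Z_{2t} = \beta_1 \circ X_{2,t-1}$, $Z_{3t} = \gamma_1 \circ (n - X_{1,t-1} - X_{2,t-1})$;
(ii) $\alpha_1, \alpha_2, \beta_1, \beta_2, \gamma_1, \gamma_2 \in (0,1)$ satisfy the logistic regression model. For parameters $\nu_{i0}$, $\nu_i = (\nu_{i1}, \nu_{i2}, \ldots, \nu_{im})^T$ and covariates $Y = (Y_1, Y_2, \ldots, Y_m)^T$, $i = 1, 2, \ldots, 6$, we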
have the logistic regression relation between each coefficient and the covariates;
(iii) $Y$ is an observable $m$-dimensional covariate vector whose components are independent of each other;
(iv) $n \in \mathbb{N}$ is a given number denoting the upper limit of the multinomial range;
(v) all thinnings are performed independently of each other, and the thinnings at time $t$ are independent of $\{X_s\}_{s<t}$.

In view of the complexity of the CMAR(1) model, we only consider CML estimation of the parameters. Here we fix $Y$ and $\nu_{i0}$, $i = 1, 2, \ldots, 6$, and consider estimation of the parameters $\nu = (\nu_1, \nu_2, \nu_3, \nu_4, \nu_5, \nu_6)^T$. The simulation results show the effectiveness of CML estimation.

Finally, we use the RCMAR(1) model to fit the monthly income data of a financial institution, and compare the fitting results with those of the MAR(1) model and some bivariate binomial autoregressive models. The results show that the random coefficient extension is necessary and meaningful: the fit of the RCMAR(1) model is significantly better than that of the MAR(1) model.
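
The MAR(1) recursion (3) can be simulated directly, because a binomial thinning of a count X with probability p is a Binomial(X, p) draw (a sum of X independent Bernoulli(p) variables). The following is a minimal Python sketch under stated assumptions, not the code used in the thesis: the function name `simulate_mar1`, the argument names `alpha`, `beta`, `gamma` (standing for the three pairs of thinning probabilities in Definition 1), and all numerical values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_mar1(n, alpha, beta, gamma, x0, steps, seed=0):
    """Simulate a path of the three-state MAR(1) recursion (illustrative sketch).

    n      : upper limit of the multinomial range
    alpha, beta, gamma : pairs (p1, p2) of thinning probabilities in (0, 1)
    x0     : initial state (X1, X2) with X1 + X2 <= n
    steps  : number of transitions to simulate
    Returns an integer array of shape (steps + 1, 2) holding (X1t, X2t).
    """
    if min(x0) < 0 or sum(x0) > n:
        raise ValueError("x0 must be non-negative with x0[0] + x0[1] <= n")
    rng = np.random.default_rng(seed)
    path = np.empty((steps + 1, 2), dtype=np.int64)
    path[0] = x0
    for t in range(1, steps + 1):
        x1, x2 = path[t - 1]
        x3 = n - x1 - x2  # count in the third state at time t-1
        # First-stage thinnings Z1t, Z2t, Z3t: units that move to state 1.
        z1 = rng.binomial(x1, alpha[0])
        z2 = rng.binomial(x2, beta[0])
        z3 = rng.binomial(x3, gamma[0])
        # Second-stage thinnings of the remaining units: units that move to state 2.
        w1 = rng.binomial(x1 - z1, alpha[1])
        w2 = rng.binomial(x2 - z2, beta[1])
        w3 = rng.binomial(x3 - z3, gamma[1])
        path[t] = (z1 + z2 + z3, w1 + w2 + w3)
    return path


# Example call with made-up parameter values.
path = simulate_mar1(n=20, alpha=(0.6, 0.3), beta=(0.2, 0.5),
                     gamma=(0.1, 0.4), x0=(5, 5), steps=200)
print(path[:5])
```

Because the second-stage thinnings act only on the units left over from the first stage, the two components always satisfy X1t + X2t <= n, which is the finite-range property the model is built around.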
Keywords/Search Tags: integer-valued time series, binomial thinning operator, random coefficient, multinomial autoregressive process, parameter estimation