
Modelling And Statistical Inference For Some Integer-valued Autoregressive Processes

Posted on: 2021-02-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Kang
GTID: 1360330623477305
Subject: Probability theory and mathematical statistics
In practice, count data on an unbounded range are commonly encountered. The Poisson INAR(1) process is a common model for such data. However, the mean and variance of the Poisson distribution are equal, so the Poisson INAR(1) process is not suitable for modelling overdispersion or underdispersion. To overcome this difficulty, researchers have proposed several extended INAR(1) models. To better model overdispersed and underdispersed count data, this thesis proposes the GSC thinning operator and the GSCINAR(1) process.

Count data with a finite range are also encountered, for example the weekly number of rainy days in the Amazon rainforest, or the weekly number of districts in Germany with at least one new case of measles. The most common way to fit such data is the BAR(1) model. However, the BAR(1) model has two main limitations: (i) it fails to capture data characteristics such as zero inflation, binomial overdispersion and binomial underdispersion; (ii) it is not suitable for explaining some practical problems, such as the job market. To address these two problems, this thesis proposes a mixed binomial autoregressive process and a generalized binomial autoregressive process.

In actuarial science, the individual risk model is built on the assumption that the premium and the claim size are independent. In practice, however, dependence between premium and claim size is commonly observed. We therefore incorporate positive dependence into the model, propose the dependent individual risk model, and study its statistical properties and risk measures.

1. Statistical inference for the GSCINAR(1) process

For count data on an unbounded range, we propose a new thinning operator and a new INAR(1) process to better model overdispersed and underdispersed count data.

Gomez-Deniz et al. (2011) introduced a discrete distribution on the non-negative integers $\{0,1,\ldots\}$. For convenience, we call it the Gomez-Deniz-Sarabia-Calderin-Ojeda (GSC) distribution, i.e., [equation missing in source], where $\alpha<1$, $\alpha\neq 0$ and $0<\theta<1$. The moments and the moment generating function of the GSC distribution are given as follows: [equations missing in source].

Based on the GSC distribution, we propose the GSC thinning operator [equation missing in source], where $\{W_j\}$ is a sequence of i.i.d. $\mathrm{GSC}(\alpha, e^{-|\beta|})$ random variables, and $\{W_j\}$ and $X$ are independent. Based on the GSC thinning operator, we introduce the GSCINAR(1) process as follows.

Definition 1 $\{X_t\}_{t\in\mathbb{N}}$ is an INAR(1) model based on the GSC thinning operator, denoted by GSCINAR(1), if it is defined by the following difference equation: [equation missing in source], where $\{W_j\}$ is a sequence of i.i.d. $\mathrm{GSC}(\alpha, e^{-|\beta|})$ random variables with finite mean $\lambda$ and variance $\nu$, $\alpha<1$, $\alpha\neq 0$. Here we write $\lambda = \frac{1}{\log(1-\alpha)}\sum_{s=1}^{\infty}\log\left(1-\alpha e^{-s|\beta|}\right)$ and $\nu = \frac{1}{\log(1-\alpha)}\sum_{s=1}^{\infty}(2s-1)\log\left(1-\alpha e^{-s|\beta|}\right)-\lambda^{2}$. $\{\varepsilon_t\}$ is an innovation sequence of i.i.d. non-negative integer-valued random variables, and $\varepsilon_t$ is uncorrelated with the past values $\{X_s\}_{s<t}$. Let $\mu_\varepsilon = E(\varepsilon_t)$ and $\sigma_\varepsilon^{2} = \mathrm{Var}(\varepsilon_t)$ (we assume that they exist).

The existence of a strictly stationary and ergodic GSCINAR(1) process is established in the following theorem.

Theorem 1 If $0<\lambda<1$, then there exists a unique strictly stationary integer-valued random series $\{X_t\}_{t\in\mathbb{N}}$ satisfying $\mathrm{Cov}(X_s,\varepsilon_t)=0$ for $s<t$. Furthermore, the process is ergodic.

We use the conditional least squares (CLS), weighted conditional least squares (WCLS) and modified quasi-likelihood (MQL) methods to estimate the model parameters. The following theorems give the asymptotic distributions of the estimators. For convenience, write [equation missing in source], where $V_{11}=E[\{\omega(X_0)(?\,X_0)(X_1-\lambda X_0-\mu_\varepsilon)\}^{2}]$, $V_{22}=E[\{\omega(X_0)(X_1-\lambda X_0-\mu_\varepsilon)\}^{2}]$, $V_{12}=E[\omega^{2}(X_0)(?\,X_0)(X_1-\lambda X_0-\mu_\varepsilon)^{2}]$, $H_{11}=E[\omega(X_0)(?\,X_0)^{2}]$, $H_{12}=E[\omega(X_0)(?\,X_0)]$, and $\omega(\cdot)$ is a weight function. It can be verified that $H_\omega$ is an invertible matrix.

Theorem 2 Suppose $E|X_t|^{4}<\infty$. Then, as $n\to\infty$, we have [equation missing in source], where $V_{\mathrm{CLS}}$ and $H_{\mathrm{CLS}}$ are given by $V_\omega$ and $H_\omega$ with $\omega(X_0)=1$.

Theorem 3 Suppose $E|X_t|^{4}<\infty$. Then, as $n\to\infty$, we have [equation missing in source], where $V_{\mathrm{WCLS}}$ and $H_{\mathrm{WCLS}}$ are given by $V_\omega$ and $H_\omega$ with $\omega(X_0)=1/(X_0+c_1)$.

Theorem 4 Suppose $E|X_t|^{4}<\infty$. Then, as $n\to\infty$, we have [equation missing in source], where $V_{\mathrm{MQL}}$ and $H_{\mathrm{MQL}}$ are given by $V_\omega$ and $H_\omega$ with $\omega(X_0)=V^{-1}(X_1\mid X_0)$.

By simulation studies, we find that the MQL method is
better than the CLS and WCLS methods. Finally, underdispersed and overdispersed real-data examples show that the proposed model performs better than the existing INAR(1) models.

2. Statistical inference for the mixed binomial autoregressive process

To model zero inflation, binomial underdispersion and binomial overdispersion in count data with bounded support, we introduce a new mixture binomial autoregressive model based on the binomial thinning operator and the Pegram operator. The Pegram operator is defined below.

Definition 2 Let $U$ and $V$ be two independent discrete random variables. The Pegram operator $*$ mixes $U$ and $V$ with weights $\phi$ and $1-\phi$, so that the marginal probability function of $Z$ is [equation missing in source].

In the following definition, we propose our new model.

Definition 3 Let $\alpha,\beta,\phi\in(0,1)$. Fix $n\in\mathbb{N}$ and the initial value of the process $X_0\in\{0,1,\ldots,n\}$. Then the BAR(1) model based on the binomial thinning operator and the Pegram operator is defined by the recursion [equation (1) missing in source], where $\circ$ and $*$ are the binomial thinning operator and the Pegram operator, respectively. The random variables $\alpha\circ X_{t-1}$ and $\beta\circ(n-X_{t-1})$ are independent of each other given $X_{t-1}$. We denote $\{X_t\}_{t\in\mathbb{N}}$ by the MPTBAR(1) model.

The strict stationarity and ergodicity of the MPTBAR(1) model are stated in the following theorem.

Theorem 5 Suppose $\{X_t\}_{t\in\mathbb{N}}$ is the MPTBAR(1) process defined by (1). Then $\{X_t\}_{t\in\mathbb{N}}$ is an irreducible, aperiodic and positive recurrent (and thus ergodic) Markov chain, and there exists a strictly stationary process satisfying (1).

Estimators of the model parameters are derived by the conditional maximum likelihood (CML) method. The following theorem gives the asymptotic properties of the estimator.

Theorem 6 The CML estimator of the MPTBAR(1) process is consistent and asymptotically normally distributed [equation missing in source], where $I(\cdot)$ is the Fisher information matrix.

3. Statistical inference for the GBAR(1) process

We propose the GBAR(1) model to address the limitation of the BAR(1) model in explaining practical settings such as the job market.

First, we define the generalized binomial thinning operator. Ristic et al. (2013) introduced a sequence of random variables $\{U_i\}_{i\in\mathbb{N}}$ defined as [equation (2) missing in source], where $\{W_i\}_{i\in\mathbb{N}}$ and $\{V_i\}_{i\in\mathbb{N}}$ are two independent sequences of i.i.d. random variables with Bernoulli($\alpha$) and Bernoulli($\theta$) distributions, $Z$ is a Bernoulli($\alpha$) random variable, and $\alpha\in[0,1]$, $\theta\in[0,1]$. Based on this random sequence, we define the generalized binomial thinning operator.

Definition 4 Let $\{U_i\}_{i\in\mathbb{N}}$ be the sequence of random variables defined in (2). The generalized binomial thinning operator "$\alpha\circ_\theta$", $\alpha,\theta\in[0,1]$, is defined as [equation missing in source], where $X$ is a non-negative integer-valued random variable.

The next definition introduces the GBAR(1) process.

Definition 5 The GBAR(1) process $\{X_t\}_{t\in\mathbb{N}}$ is defined by the recursion [equation (3) missing in source], where $n\in\mathbb{N}$, $\beta:=p(1-\rho)$, $\alpha:=\beta+\rho$, $p\in(0,1)$ and $\rho\in\left(\max\left\{-\frac{p}{1-p},-\frac{1-p}{p}\right\},1\right)$.

The strict stationarity and ergodicity of the GBAR(1) model are stated in the following theorem.

Theorem 7 Suppose $\{X_t\}_{t\in\mathbb{N}}$ is the GBAR(1) process defined by (3). Then $\{X_t\}_{t\in\mathbb{N}}$ is an irreducible, aperiodic and positive recurrent (and thus ergodic) Markov chain, and there
exists a strictly stationary process satisfying (3).

Three methods of parameter estimation, namely CML, CLS and MQL, are considered, and asymptotic results for the estimators are derived. The next theorem states the asymptotic properties of the CML estimator of the model parameters.

Theorem 8 The CML estimator of the GBAR(1) process is consistent and asymptotically normally distributed [equation missing in source], where $I(p,\rho,\theta)$ is the Fisher information matrix.

The next theorem states the asymptotic properties of the CLS estimator of $(\rho,p)'$.

Theorem 9 Suppose that $(\hat\rho_{\mathrm{CLS}},\hat p_{\mathrm{CLS}})'$ are the CLS estimators of the GBAR(1) model parameters $(\rho,p)'$. Then, as $T\to\infty$, we have [equation missing in source], where $\sigma_1^{2}=E[(np-X_1)^{2}(X_2-\rho X_1-n(p-p\rho))^{2}]$, $\sigma_2^{2}=E[n^{2}(\rho-1)^{2}(X_2-\rho X_1-n(p-p\rho))^{2}]$, $\sigma_{12}=E[n(\rho-1)(np-X_1)(X_2-\rho X_1-n(p-p\rho))^{2}]$, $V_{11}=E(X_1-np)^{2}$, $V_{12}=V_{21}=n(1-\rho)E(X_1-np)$ and $V_{22}=n^{2}(1-\rho)^{2}$.

The next theorem states the asymptotic properties of the CLS estimator of $\theta$.

Theorem 10 Suppose that $\hat\theta_{\mathrm{CLS}}$ is the CLS estimator of the GBAR(1) model parameter $\theta$. Then, as $T\to\infty$, we have [equation missing in source], where $J=E\{[c_1(2\theta X_1^{2}-2\theta X_1)+c_2(2\theta(n^{2}-2nX_1+X_1^{2})-2\theta(n-X_1))]^{2}\}$, $D^{2}=E\{[c_1(2\theta X_1^{2}-2\theta X_1)+c_2(2\theta(n^{2}-2nX_1+X_1^{2})-2\theta(n-X_1))]^{2}\times((X_2-\rho X_1-n(p-p\rho))^{2}-[c_1(\theta^{2}X_1^{2}+(1-\theta^{2})X_1)+c_2(\theta^{2}(n^{2}-2nX_1+X_1^{2})+(1-\theta^{2})(n-X_1))])^{2}\}$, $c_1=(p-p\rho+\rho)(1-p+p\rho-\rho)$ and $c_2=(p-p\rho)(1+p\rho-p)$.

The next theorem states the asymptotic properties of the MQL estimator of $(\rho,p)'$.

Theorem 11 Suppose that $(\hat\rho_{\mathrm{MQL}},\hat p_{\mathrm{MQL}})'$ are the MQL estimators of the GBAR(1) model parameters $(\rho,p)'$. Then, as $T\to\infty$, we have [equation and accompanying definitions missing in source].

The next theorem states the asymptotic properties of the MQL estimator of $\theta$.

Theorem 12 Suppose that $\hat\theta_{\mathrm{MQL}}$ is the MQL estimator of the GBAR(1) model parameter $\theta$. Then, as $T\to\infty$, we have [equation missing in source], where $R=E(V_\theta^{-1}(X_2\mid X_1)[c_1(2\theta X_1^{2}-2\theta X_1)+c_2(2\theta(n^{2}-2nX_1+X_1^{2})-2\theta(n-X_1))]^{2})$, $K^{2}=E\{V_\theta^{-2}(X_2\mid X_1)[c_1(2\theta X_1^{2}-2\theta X_1)+c_2(2\theta(n^{2}-2nX_1+X_1^{2})-2\theta(n-X_1))]^{2}\times((X_2-\rho X_1-n(p-p\rho))^{2}-[c_1(\theta^{2}X_1^{2}+(1-\theta^{2})X_1)+c_2(\theta^{2}(n^{2}-2nX_1+X_1^{2})+(1-\theta^{2})(n-X_1))])^{2}\}$, $c_1=(p-p\rho+\rho)(1-p+p\rho-\rho)$ and $c_2=(p-p\rho)(1+p\rho-p)$.

We compare the three estimation methods by simulation studies and find that the CML method performs best when the data are clean, while the MQL method performs best when contaminated data
exist. Finally, a real-data example shows that the performance of the model is satisfactory.

4. Risk measure for the individual risk model based on copula

The classical individual risk model supposes that the premium and the claim size are independent. In practice, however, dependence between premium and claim size is commonly observed. We therefore improve the model and propose the dependent individual risk model [equation (4) missing in source], where $u>0$ is the initial surplus. $X_i$, $i=1,2,\ldots,n$, is the premium of the $i$th insurance contract holder in a policy year, and the $X_i$ are independent and identically distributed. The $I_i$, independent of $X_i$ and $Y_i$, are independent Bernoulli random variables with parameter $p$: $I_i=1$ means that there is a claim on the $i$th contract holder, with claim size $Y_i$, while $I_i=0$ means that there is no claim on the $i$th contract holder. $Y_i$ is the claim size of the $i$th insurance contract holder in a policy year when there is a claim ($I_i=1$), and the $Y_i$ are independent and identically distributed. $n$ is a fixed constant representing the number of insurance policies in the portfolio.

Furthermore, the fixed number of insurance policies $n$ in (4) can be replaced by a non-negative integer-valued random variable $N$, so we also consider the model [equation (5) missing in source], where $N$ follows a power series distribution.

The following theorems give the risk measures for the model under different dependence assumptions. Denote the net loss of $U_1$ by $L_1=Y_1I_1-X_1$.

Theorem 13 Suppose $U_n$ is the individual risk model defined by (4) and $(X_i,Y_i)$ follows a bivariate exponential distribution. Then, at a level $\kappa$, $1-\psi<\kappa<1$, the loss probability $\psi$, VaR and TVaR for $U_1$ are [equations and accompanying definitions missing in source].

Theorem 14 Suppose $U_n$ is the individual risk model defined by (4), $X_i$ and $Y_i$ follow exponential distributions, and the dependence structure is described by a bivariate FGM copula. Then, at a level $\kappa$, $1-\psi<\kappa<1$, the loss probability $\psi$, VaR and TVaR for $U_1$ are [equations and accompanying definitions missing in source].

Theorem 15 Suppose $U_n$ is the individual risk model defined by (4), $X_i$ and $Y_i$ follow mixed exponential distributions, and the dependence structure is described by a bivariate FGM copula. Then, at a level $\kappa$, $1-\psi<\kappa<1$, the loss probability $\psi$, VaR and TVaR for $U_1$ are [equations and accompanying definitions missing in source].

The next theorem gives the risk measures of $U_n$. Denote the net loss of $U_n$ by $L_n=\sum_{i=1}^{n}(Y_iI_i-X_i)$. $F_{L_1}^{(n)}(s)$ and $f_{L_1}^{(n)}(s)$ denote the cumulative distribution function and the probability density function of the convolution of $n$ independent net losses $L_1$, with $F_{L_1}^{(0)}(s)=1$ and $f_{L_1}^{(0)}(s)=0$.

Theorem 16 Suppose $U_n$ is the individual risk model defined by (4). Then, at a level $\kappa$, $1-\psi<\kappa<1$, the loss probability $\psi$, VaR and TVaR for $U_n$ are [equations missing in source].

The next theorem gives the risk measures of $U_N$.

Theorem 17 Suppose $U_N$ is the individual risk model defined by (5). Then, at a level $\kappa$, $1-\psi<\kappa<1$, the loss probability $\psi$, VaR and TVaR for $U_N$ are [equations missing in source].

Finally, we give numerical results for the risk measures when the number of policies is $n=1$ and $n=2$. The normal approximation method is applied to evaluate the risk of the model when the number of policies is large or is replaced by a power series random variable $N$.
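
To make the INAR(1) baseline that the abstract starts from concrete, the following is a minimal sketch of a Poisson INAR(1) process with binomial thinning, X_t = alpha o X_{t-1} + eps_t, together with CLS estimates obtained by least squares of X_t on X_{t-1}. It is not the GSCINAR(1) model of the thesis: the binomial thinning operator, the Poisson innovations, the parameter values, the seed and all function names are assumptions chosen for illustration.

```python
import math
import random
from statistics import fmean, pvariance


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def binomial_thinning(x, alpha, rng):
    """Return alpha o x: the number of successes among x Bernoulli(alpha) trials."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def simulate_poisson_inar1(n, alpha, lam, rng):
    """Simulate X_t = alpha o X_{t-1} + eps_t with eps_t ~ Poisson(lam)."""
    # Start from the stationary marginal, which is Poisson(lam / (1 - alpha)).
    path = [poisson_draw(lam / (1.0 - alpha), rng)]
    for _ in range(n - 1):
        survivors = binomial_thinning(path[-1], alpha, rng)
        path.append(survivors + poisson_draw(lam, rng))
    return path


def cls_estimates(path):
    """CLS for INAR(1): least squares of X_t on X_{t-1} with an intercept."""
    prev, curr = path[:-1], path[1:]
    prev_mean, curr_mean = fmean(prev), fmean(curr)
    sxy = sum((a - prev_mean) * (b - curr_mean) for a, b in zip(prev, curr))
    sxx = sum((a - prev_mean) ** 2 for a in prev)
    alpha_hat = sxy / sxx                           # thinning parameter
    mu_eps_hat = curr_mean - alpha_hat * prev_mean  # innovation mean
    return alpha_hat, mu_eps_hat


rng = random.Random(12345)  # arbitrary fixed seed (illustrative assumption)
x = simulate_poisson_inar1(20000, alpha=0.5, lam=2.0, rng=rng)
alpha_hat, mu_eps_hat = cls_estimates(x)
dispersion = pvariance(x) / fmean(x)  # variance-to-mean ratio of the sample
print(f"alpha_hat={alpha_hat:.3f}  mu_eps_hat={mu_eps_hat:.3f}  dispersion={dispersion:.3f}")
```

With Poisson innovations the stationary marginal is itself Poisson, so the printed dispersion index should stay close to 1. This is the equidispersion restriction that motivates the alternative thinning operators discussed above.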
Keywords: Integer-valued time series, thinning operator, INAR(1) model, BAR(1) model, parameter estimation