
Statistical Inference On Single-index Regression Models And Semi-varying Coefficient Models

Posted on: 2012-06-11 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: B He | Full Text: PDF
GTID: 1220330368478794 | Subject: Probability theory and mathematical statistics
Abstract/Summary:
The single-index model is an important class of statistical models in non-parametric regression analysis and a powerful tool for multivariate non-parametric regression. It is attractive in both theory and practice, and has been applied to discrete choice analysis in econometrics and to dose-response models in bioassay (Hardle et al. [28]). The key feature of such models is that they reduce a multivariate covariate vector to a univariate index. This avoids the "curse of dimensionality" while still capturing the important characteristics of high-dimensional data. Statistical inference for this type of semi-parametric model is therefore an important problem in multivariate non-parametric regression. It is also a topic of active current research and the main subject of this dissertation.
A large number of articles have considered estimation of the single-index parameter and of the non-parametric part, focusing on $\sqrt{n}$-consistency and efficiency (see Carroll et al. [4]). In the early literature, the basic and most popular methods for estimating $\beta$ and $g(\cdot)$ were weighted average derivative estimation and kernel estimation (see Powell et al. [67]). As research progressed, many other methods were used to estimate the unknown parts of the model, including semi-parametric least squares (SLS), weighted least squares (WLS), M-estimation, kernel estimation, quasi-maximum likelihood estimation and minimum average variance estimation (MAVE). In practical applications, however, the basic assumption on the errors in these models is not always appropriate. In particular, for data observed in the same period, the errors often show contemporaneous correlation. In such cases we use the seemingly unrelated regression (SUR) model to handle the data.
We also consider statistical inference for a generalized partially linear single-index model and a test for serial correlation in the semi-varying coefficient model. The main results of this dissertation are introduced below.
In Chapter 2, to address contemporaneous correlation among the equations of single-index regression models, we use a new method to estimate the parametric and non-parametric parts and establish the asymptotic normality of the estimators. We also show that both the parametric and the non-parametric estimators are more efficient than those that ignore the contemporaneous correlation.
Consider the following seemingly unrelated single-index model:
$$Y_{ij} = g_j(X_{ij}^{\tau}\beta_{0j}) + \varepsilon_{ij}, \quad i = 1, \dots, n, \; j = 1, \dots, m,$$
where the $Y_{ij}$ are responses, $X_{ij} = (X_{ij1}, \dots, X_{ijp_j})^{\tau}$ are design points, and the $\varepsilon_{ij}$ are random errors such that $E(\varepsilon_{ij}) = 0$, $E(\varepsilon_{ij_1}\varepsilon_{ij_2}) = \sigma_{j_1 j_2}^2$, and $E(\varepsilon_{i_1 j_1}\varepsilon_{i_2 j_2}) = 0$ when $i_1 \neq i_2$. The coefficients $\beta_{0j} = (\beta_{0j1}, \dots, \beta_{0jp_j})^{\tau}$ are unknown parameter vectors and the $g_j(\cdot)$ are unknown functions. For identification, we assume that $\|\beta_{0j}\| = 1$ and that the first element of each $\beta_{0j}$ is positive, where $\|\cdot\|$ is the Euclidean norm.
If we ignore the contemporaneous correlation, we can estimate $\beta_{0j}$ by the average derivative estimator $\hat\beta_{0j}$.
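The identification constraint stated above (unit Euclidean norm, positive first element) can be imposed on any candidate index vector by rescaling. The following is a minimal sketch in Python, not taken from the dissertation; the function name `normalize_index` is hypothetical and chosen only for illustration.

```python
import math


def normalize_index(beta):
    """Rescale beta so that ||beta|| = 1 and its first element is positive.

    Hypothetical helper illustrating the identification constraint only.
    """
    norm = math.sqrt(sum(b * b for b in beta))
    if norm == 0.0:
        raise ValueError("the zero vector cannot be normalized")
    if beta[0] == 0.0:
        raise ValueError("first element is zero, so it cannot be made positive")
    # Flip the sign of the whole vector if the first element is negative.
    sign = -1.0 if beta[0] < 0.0 else 1.0
    return [sign * b / norm for b in beta]


# Example: (-3, 4) has norm 5 and a negative first element.
beta_normalized = normalize_index([-3.0, 4.0])
```

Because $g_j$ is unknown, a sign flip or rescaling of the index can be absorbed into $g_j$, which is why a normalization of this kind is needed before the parameter is well defined.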
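We take $\hat\beta_{0j}$ as an initial estimator of $\beta_{0j}$. The estimators of $g_j$ and $g_j'$ are then obtained by minimizing, with respect to $a_j$ and $b_j$, the weighted sum of squares
$$\sum_{i=1}^{n} \{Y_{ij} - a_j - b_j(X_{ij}^{\tau}\hat\beta_{0j} - z_j)\}^2 K_{h_j}(X_{ij}^{\tau}\hat\beta_{0j} - z_j),$$
where $K_{h_j}(\cdot) = K(\cdot/h_j)/h_j$, $K$ is a symmetric kernel function on $R^1$ and $h_j$ is a bandwidth. Let $\hat g_j(z_j; \hat\beta_{0j}, h_j) = \hat a_j$ and $\hat g_j'(z_j; \hat\beta_{0j}, h_j) = \hat b_j$ be the resulting estimators. Based on $\hat\beta_{0j}$ and $\hat g_j(\cdot)$, we obtain the estimated residuals $\hat\varepsilon_{ij} = Y_{ij} - \hat g_j(X_{ij}^{\tau}\hat\beta_{0j}; \hat\beta_{0j}, h_j)$. Since $E(\varepsilon_{ij_1}\varepsilon_{ij_2}) = \sigma_{j_1 j_2}^2$, each $\sigma_{j_1 j_2}^2$ can be estimated by the corresponding sample average of the products $\hat\varepsilon_{ij_1}\hat\varepsilon_{ij_2}$.
Taking the contemporaneous correlation into account, we estimate $\beta_{0j}$ by minimizing the objective function (2), where $Y_j = (Y_{1j}, \dots, Y_{nj})^{\tau}$, $G_j(\beta_{0j}) = (g_j(X_{1j}^{\tau}\beta_{0j}), \dots, g_j(X_{nj}^{\tau}\beta_{0j}))^{\tau}$ and $\Sigma = (\sigma_{j_1 j_2}^2)_{j_1, j_2 = 1}^{m}$. Minimizing (2) amounts to solving a constrained nonlinear least squares problem. Following Yu and Ruppert [98], we reparameterize $\beta_{0j}$. Let $\gamma_j = (\gamma_{j1}, \dots, \gamma_{j,p_j-1})^{\tau}$ and define $\beta_j(\gamma_j) = (\sqrt{1 - \|\gamma_j\|^2}, \gamma_{j1}, \dots, \gamma_{j,p_j-1})^{\tau}$, so that the reparameterized parameters $\gamma_j$ are no longer subject to the unit-norm constraint. The Jacobian matrix $J_j = \partial\beta_j/\partial\gamma_j$ has dimension $p_j \times (p_j - 1)$. Next, we estimate $\gamma_j$ by minimizing the objective function expressed in terms of $\gamma_j$, and finally transform the minimizer back to obtain the estimator $\hat\beta_j^{w}$ of $\beta_{0j}$.
To obtain the asymptotic behavior of the parametric estimators, we first give the following conditions.
Assumption 1. (i) The distribution of $X_j$ has a compact support set $A_j$, $j = 1, 2, \dots, m$; (ii) the density function $p_j(\cdot)$ of $X_j^{\tau}\beta_j$ is positive and satisfies a Lipschitz condition of order 1 for $\beta_j$ in a neighborhood of $\beta_{0j}$. Further, $X_j^{\tau}\beta_{0j}$ has a positive and bounded density function $p_j(\cdot)$ on its support $T_j$.
Assumption 2. (i) The function $g_j$ has two bounded and continuous derivatives, $j = 1, \dots, m$; (ii) $l_{js}(\cdot)$ satisfies a Lipschitz condition of order 1, where $l_{js}(\cdot)$ is the $s$th component of $l_j(\cdot)$, $l_j(z) = E(X_j \mid X_j^{\tau}\beta_j = z)$, $1 \le s \le p_j$.
Assumption 3. (i) The kernel $K$ is a bounded, continuous and symmetric probability density function satisfying [...]; (ii) $K$ satisfies a Lipschitz condition on $R^1$.
Assumption 4. $E(\varepsilon_{ij}) = 0$, $E(\varepsilon_{ij_1}\varepsilon_{ij_2}) = \sigma_{j_1 j_2}^2$, and $E(\varepsilon_{i_1 j_1}\varepsilon_{i_2 j_2}) = 0$ when $i_1 \neq i_2$; $E(\varepsilon_{ij}^4) < \infty$, $i = 1, \dots, n$, $j = 1, \dots, m$.
Assumption 5. [...]
Assumption 6. [...] is a positive definite matrix.
For $\hat\beta^{w}$, we have the following asymptotic property.
Theorem 1. Suppose that Assumptions 1 to 6 hold. Then we have [...], where $\sigma^{j_1 j_2}$ is the $(j_1, j_2)$ entry of $\Sigma^{-1}$ and $\Lambda_{1j} = X_{1j}^{\tau}\beta$. In particular, [...].
Theorem 2. The asymptotic covariance of $\hat\beta_j^{w}$ in Theorem 1 is smaller than that of the estimator $\hat\beta_j$ that ignores the contemporaneous correlation.
Next, we consider a two-stage non-parametric estimation. To obtain the asymptotic properties of the non-parametric estimators, we give two further technical assumptions.
Assumption 7. [...]
Assumption 8. The bandwidth $h_j^{*}$ satisfies $h_j^{*} = O(n^{-1/5})$ and $n h_j^{*4} \to \infty$ as $n \to \infty$. Furthermore, $h_j = o(h_j^{*})$.
Theorem 3. Suppose that Assumptions 1(i), 2(i), 5(i), 7 and 8 hold. Then we have [...].
Remark 1. It is easy to see that $\sigma_{jj}^2 \ge (\sigma^{jj})^{-1}$; therefore $\hat g_j^{T}(\cdot)$ is asymptotically more efficient than $\hat g_j(\cdot)$ in the sense of asymptotic variance.
In Chapter 3, we consider empirical-likelihood-based inference for the parameters in a generalized partially linear single-index model (GPLSIM). Based on local linear estimators of the non-parametric parts, an estimated empirical-likelihood-based statistic for the parametric components is proposed. We show that the resulting statistic is asymptotically standard chi-squared, and confidence regions for the parametric components are constructed.
Consider the following generalized partially linear single-index model:
$$g\{\mu(x, z)\} = \eta_0(\alpha_0^{\tau}x) + \beta_0^{\tau}z$$
for some given link function $g(\cdot)$, where $\mu(x, z) = E(Y \mid X = x, Z = z)$, $X = (X_1, \dots, X_p)^{\tau}$ and $Z = (Z_1, \dots, Z_q)^{\tau}$, $x = (x_1, \dots, x_p)^{\tau}$ and $z = (z_1, \dots, z_q)^{\tau}$, $\alpha_0$ and $\beta_0$ are unknown parameters, and $\eta_0(\cdot)$ is an unknown univariate function. $\|\alpha_0\| = 1$ is required for identifiability.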
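The local linear fits used for the non-parametric parts (for $g_j$ in Chapter 2 and for $\eta_0$ in Chapter 3) can be sketched as a kernel-weighted least squares problem at a single point. The Python code below is a minimal illustration under assumptions of my own: the Epanechnikov kernel, the function names and the closed-form 2x2 solve are illustrative choices, not the dissertation's implementation, and the index values are assumed to have been computed already from some estimate of the index parameter.

```python
def epanechnikov(t):
    """Symmetric kernel density with bounded support [-1, 1] (an assumed choice)."""
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def local_linear(z0, index, y, h):
    """Local linear fit at z0.

    Minimizes sum_i {y_i - a - b*(index_i - z0)}^2 * K_h(index_i - z0)
    and returns (a_hat, b_hat), which estimate g(z0) and g'(z0).
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for u, yi in zip(index, y):
        d = u - z0
        w = epanechnikov(d / h) / h  # K_h(d) = K(d / h) / h
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * yi
        t1 += w * d * yi
    # Normal equations: [[s0, s1], [s1, s2]] @ (a, b) = (t0, t1)
    det = s0 * s2 - s1 * s1
    if det <= 0.0:
        raise ValueError("too few observations within the bandwidth window")
    a_hat = (s2 * t0 - s1 * t1) / det
    b_hat = (s0 * t1 - s1 * t0) / det
    return a_hat, b_hat


# Example on noiseless linear data, where a local linear fit is exact.
index_values = [i / 25.0 - 1.0 for i in range(51)]
responses = [2.0 + 3.0 * u for u in index_values]
a_hat, b_hat = local_linear(0.2, index_values, responses, h=0.3)
```

A local linear fit returns the function value and the derivative together, which matches the pair of estimators $(\hat a_j, \hat b_j)$ described in the text.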
Here we mainly consider the case where $\alpha_0$ is given or can be estimated with reasonable accuracy. In local quasi-likelihood, we approximate $\eta_0(\cdot)$ locally by a linear function, $\eta_0(v) \approx a + b(v - u)$, for $v$ in a neighborhood of $u$, where $a = \eta_0(u)$ and $b = \eta_0'(u)$. Let $K$ be a symmetric probability density function and let $K_h(t) = K(t/h)/h$ be a rescaling of $K$. The function $K$ is usually called a kernel function, and the parameter $h$ is called the bandwidth. Suppose that we have a random sample of size $n$, $\{(Y_i, X_i, Z_i)\}_{i=1}^{n}$, where $X_i = (X_{i1}, X_{i2}, \dots, X_{ip})^{\tau}$ and $Z_i = (Z_{i1}, Z_{i2}, \dots, Z_{iq})^{\tau}$. With the given value of $\alpha_0$, the estimation procedure for $\beta$ and $\eta_0(\cdot)$ is as follows.
Step 1. Find $\hat\eta(u; h, \alpha_0)$ and $\hat\beta$ by maximizing the local quasi-likelihood [...] with respect to $a$, $b$ and $\beta$, where $\beta = (\beta_1, \beta_2, \dots, \beta_q)^{\tau}$.
Step 2. Update $\hat\beta$ by maximizing [...] with respect to $\beta$.
Step 3. Obtain the final estimator $\hat\eta(u; h, \alpha_0, \hat\beta)$ by maximizing the local quasi-likelihood [...] with respect to $a$ and $b$.
The estimator $\hat\beta$ of $\beta$ can be obtained by maximizing (4) with respect to $\beta$. This maximization may be carried out by solving the likelihood equation [...]. With this, the estimated empirical likelihood ratio statistic for $\beta$ can be defined by [...], where $W_i(\beta; \hat\eta(\alpha_0^{\tau}X_i; h, \alpha_0)) = \frac{\partial}{\partial\beta} Q[g^{-1}\{\hat\eta(\alpha_0^{\tau}X_i; h, \alpha_0) + \beta^{\tau}Z_i\}, Y_i]$.
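An empirical likelihood ratio for a mean-zero constraint is usually evaluated through a Lagrange multiplier. The Python sketch below shows the standard construction for scalar scores only, which is a simplification: in the text the $W_i$ are vectors, and the exact statistic used in the dissertation is not reproduced in this abstract. The function name `el_statistic` and the use of bisection are assumptions made for illustration.

```python
import math


def el_statistic(w, tol=1e-13, max_iter=500):
    """-2 log empirical likelihood ratio for H0: E(w_i) = 0, scalar w_i.

    The weights are p_i = 1 / (n * (1 + lam * w_i)), where lam solves
    sum_i w_i / (1 + lam * w_i) = 0.
    """
    lo_w, hi_w = min(w), max(w)
    if not (lo_w < 0.0 < hi_w):
        # Zero lies outside the convex hull of the data: the ratio is zero.
        return math.inf
    # lam must keep every 1 + lam * w_i strictly positive.
    lo, hi = -1.0 / hi_w, -1.0 / lo_w

    def score(lam):
        return sum(wi / (1.0 + lam * wi) for wi in w)

    # score is strictly decreasing in lam on (lo, hi), so bisection is safe.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log(1.0 + lam * wi) for wi in w)


# Example: two scores, -1 and 2; the multiplier solves to 1/4.
stat = el_statistic([-1.0, 2.0])
```

In the vector case the same equation is solved for a vector multiplier, typically by Newton iterations, and the resulting statistic is compared with a chi-squared quantile to build a confidence region.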
By the Lagrange multiplier method, it can be shown that [...], where $\lambda$ is determined by [...].
To obtain the asymptotic behavior of the parametric parts, we give the following conditions.
Assumption 9. The function $q_2(x, y) < 0$ for $x \in R$ and $y$ in the range of the response variable.
Assumption 10. The marginal density of $\alpha_0^{\tau}X$ is positive and continuous at the point $u$.
Assumption 11. The function $\eta_0''(\cdot)$ is continuous at the point $u$.
Assumption 12. $g''(\cdot)$ and $V(\cdot)$ are continuous functions.
Assumption 13. With $U = \alpha_0^{\tau}X$ and $R = \eta_0(U) + \beta_0^{\tau}Z$, the functions $E\{q_1^2(R, Y) \mid U = t\}$, $E\{q_1^2(R, Y)Z \mid U = t\}$ and $E\{q_1^2(R, Y)ZZ^{\tau} \mid U = t\}$ are continuous in $t$ at the point $u$. Moreover, $E\{q_2^2(R, Y)\} < \infty$ and $E\{q_1^{2+\delta}(R, Y)\} < \infty$ for some $\delta > 2$. $\Gamma(\beta_0) = E\{q_1(m_1, Y_1)Z_1 + \gamma(U_1)v_1\}^{\otimes 2}$ is a positive definite matrix, where $U_1 = \alpha_0^{\tau}X_1$, $m_1 = \eta_0(\alpha_0^{\tau}X_1) + \beta_0^{\tau}Z_1$, $\gamma(u) = E[\rho_2\{\eta_0(u) + \beta_0^{\tau}Z\}Z \mid \alpha_0^{\tau}X = u]$, $v_1$ is the first element of $q_1(m_1, Y_1)\Sigma^{-1}(\alpha_0^{\tau}X_1)(1, Z_1^{\tau})^{\tau}$ and [...].
Assumption 14. The kernel $K$ is a symmetric density function with bounded support.
Assumption 15. The random vector $Z$ has bounded support.
Assumption 16. The bandwidth $h_1$ satisfies $n h_1^4 \to 0$ and $n h_1^2/\log(1/h_1) \to \infty$.
Theorem 4. Suppose that Assumptions 9-16 hold. If $\beta_0$ is the true value of the parameter, then we have [...].
In Chapter 4, we discuss testing for serial correlation in the semi-varying coefficient model. We propose a testing method based on empirical likelihood, which can test for higher-order serial correlation and makes no assumptions on the error distribution. We also give the asymptotic distribution of the test statistic.
Consider the following varying-coefficient model:
$$Y = \sum_{i=1}^{p} \alpha_i(U)X_i + \varepsilon, \qquad (5)$$
where $(U, X_1, X_2, \dots, X_p)$ are given covariates, $Y$ is the response, $\varepsilon$ is independent of $(U, X_1, X_2, \dots, X_p)$, $E(\varepsilon) = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$. If the null hypothesis $H_0: \alpha_i(U) = \beta_i$ holds for some of the indices $i$, model (5) becomes a semi-varying coefficient model, defined as follows:
$$Y = \sum_{i=1}^{p} \alpha_i(U)X_i + \sum_{l=1}^{q} \beta_l Z_l + \varepsilon, \qquad (6)$$
where $(U, X_1, X_2, \dots, X_p, Z_1, \dots, Z_q)$ are given covariates and $\varepsilon$ is independent of $(U, X_1, X_2, \dots, X_p, Z_1, \dots, Z_q)$.
Consider the following semi-varying coefficient model:
$$Y_k = \alpha_k^{\tau}X_k + \beta^{\tau}Z_k + \varepsilon_k, \quad k = 1, 2, \dots, n,$$
where $\{(U_k, X_{k1}, \dots, X_{kp}, Z_{k1}, \dots, Z_{kq}, Y_k), k = 1, 2, \dots, n\}$ is a random sample from model (6), $\alpha_k = (\alpha_1(U_k), \alpha_2(U_k), \dots, \alpha_p(U_k))^{\tau}$ is a vector of unknown coefficient functions, and $\varepsilon_k$ is a stochastic error satisfying the AR(d) model
$$\varepsilon_k = a_1\varepsilon_{k-1} + a_2\varepsilon_{k-2} + \dots + a_d\varepsilon_{k-d} + e_k$$
or the MA(d) model [...], where the $\{e_k\}$ are independent and identically distributed random variables with $E(e_k) = 0$ and $\mathrm{Var}(e_k) = \sigma^2 < +\infty$. The $a_i$, $i = 1, 2, \dots, d$, are unknown autoregressive or moving-average coefficients, and the AR(d) model satisfies the stationarity condition, i.e. the roots of $a(u) = 1 - a_1 u - a_2 u^2 - \dots - a_d u^d = 0$ lie outside the unit circle.
Define $a^{\tau} = (a_1, a_2, \dots, a_d)$. Our testing problem can be stated as the following hypotheses:
$$H_0: a = 0 \quad \text{versus} \quad H_1: a \neq 0.$$
Denote $\gamma_i = E\varepsilon_t\varepsilon_{t+i}$, $i = 0, 1, \dots, d$, and $\gamma^{\tau} = (\gamma_1, \gamma_2, \dots, \gamma_d)$, and define $\Gamma_d = (\gamma_{|i-j|})_{i,j=1}^{d}$. For the AR(d) model, we have the Yule-Walker equation $\Gamma_d a = \gamma$. From time series theory we know that $a = \Gamma_d^{-1}\gamma$, so $a = 0$ is equivalent to $\gamma = 0$. For the MA(d) model, we have [...]; clearly, $a_1 = a_2 = \dots = a_d = 0$ is equivalent to $\gamma_1 = \gamma_2 = \dots = \gamma_d = 0$. Hence, for both the AR(d) and the MA(d) model, the testing problem can be transformed into testing the following hypotheses:
$$H_0: \gamma = 0 \quad \text{versus} \quad H_1: \gamma \neq 0.$$
Let $w_{ki} = \varepsilon_k\varepsilon_{k+i} = (Y_k - \alpha_k^{\tau}X_k - \beta^{\tau}Z_k)(Y_{k+i} - \alpha_{k+i}^{\tau}X_{k+i} - \beta^{\tau}Z_{k+i})$ for $i = 1, 2, \dots, d$ and $k = 1, 2, \dots, n - d$, and let $W_k^{\tau} = (w_{k1}, w_{k2}, \dots, w_{kd})$. Under the null hypothesis, $E(W_k) = 0$. Therefore, testing whether the errors are serially correlated is equivalent to testing whether $E(W_k)$ is zero. Denote $T = n - d$. Under the null hypothesis $E(W_k) = 0$, we obtain the empirical likelihood ratio function [...]. Replacing the unknown $\beta$ and $\alpha_k$ with their estimators $\hat\beta$ and $\hat\alpha_k$, we get the estimated empirical likelihood ratio function [...]. By the Lagrange multiplier method, we obtain [...], where $\lambda$ is the solution of the equation [...]. Substituting this into the empirical likelihood ratio function gives the log-empirical likelihood ratio function [...].
To obtain the main result, we require the following conditions.
Assumption 17. The random variable $U$ has a compact support set $D$.
Its density function $f(\cdot)$ is Lipschitz continuous and bounded away from 0 on its support.
Assumption 18. The $p \times p$ matrix $E(XX^{\tau} \mid U)$ is non-singular. $E(X \mid U)$, $E(XX^{\tau} \mid U)$, $E(XX^{\tau} \mid U)^{-1}$, $E(XX^{\tau} * XX^{\tau} \mid U)$ and $E(XZ^{\tau} \mid U)$ are all Lipschitz continuous.
Assumption 19. The function $K(\cdot)$ is a symmetric density function with compact support.
Assumption 20. The bandwidth satisfies $h \to 0$, $nh \to \infty$ and $nh^5 = o(1)$.
Assumption 21. The functions $\alpha_i(\cdot)$, $i = 1, 2, \dots, p$, have continuous second derivatives.
Assumption 22. There is an $s > 2$ such that $E\|X\|^{2s} < \infty$ and $E\|Z\|^{2s} < \infty$, and for some $\varepsilon < 2 - s^{-1}$, $n^{2\varepsilon - 1} \to \infty$.
Under these conditions, we have the following conclusion.
Theorem 5. Under the null hypothesis and Assumptions 17-22, we have...
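The lag-product vectors $W_k$ that drive the serial correlation test in Chapter 4 can be formed directly from residuals. The Python sketch below assumes the residuals $\hat\varepsilon_k = Y_k - \hat\alpha_k^{\tau}X_k - \hat\beta^{\tau}Z_k$ are already available from some fit of the semi-varying coefficient model (the fitting step is not shown); the function names `lag_products` and `column_means` are hypothetical.

```python
def lag_products(residuals, d):
    """Return the T = n - d vectors W_k = (e_k * e_{k+1}, ..., e_k * e_{k+d})."""
    n = len(residuals)
    if not 1 <= d < n:
        raise ValueError("d must satisfy 1 <= d < n")
    return [
        [residuals[k] * residuals[k + i] for i in range(1, d + 1)]
        for k in range(n - d)
    ]


def column_means(rows):
    """Sample mean of the W_k, an estimate of (gamma_1, ..., gamma_d)."""
    count = len(rows)
    return [sum(col) / count for col in zip(*rows)]


# Example: alternating residuals have negative lag-1 and positive lag-2 products.
W = lag_products([1.0, -1.0, 1.0, -1.0, 1.0, -1.0], d=2)
gamma_hat = column_means(W)
```

Under the null hypothesis of no serial correlation the mean of the $W_k$ should be close to zero, and the empirical likelihood ratio described in the text measures how plausible that mean-zero constraint is.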
Keywords/Search Tags:single index model, semi-varying coefficients model, empirical likelihood, confidence region, nonlinear least squares estimation, quasi-likelihood equation, serial correlation