
Sieve Statistical Inference For Generalized Partial Linear Models And Others

Posted on: 2008-08-02    Degree: Doctor    Type: Dissertation
Country: China    Candidate: X G Wang    Full Text: PDF
GTID: 1100360212497697    Subject: Probability theory and mathematical statistics

Abstract/Summary:
Consider the semiparametric Generalized Partial Linear Model (GPLM)

    Y = μ{β^T X + g(T)} + ε,

where Y is the response variable, (X, T) are the explanatory variables and ε is the random error; μ is a known link function, β = (β_1, ..., β_p)^T ∈ R^p is a vector of unknown parameters, and g is an unknown smooth function.

First, we consider a classical robust estimation method, M-estimation, for GPLMs. Let T ~ U[0, 1], and let ε be independent of (X, T) with Eε = 0. Write θ = (β, g)^T and call these the model parameters; suppose θ_0 = (β_0, g_0)^T is the true parameter. Let A be a compact set in R^p with β ∈ A, and let B be the set of all functions satisfying a Hölder condition of order r = m + γ, where r is the smoothness parameter and γ ∈ (0, 1]. Every g ∈ B can be approximated by a B-spline function at a certain rate. Let Θ = {θ : θ ∈ A × B} be the parameter space. Write W_i = (X_i, T_i, Y_i)^T, i = 1, 2, ..., n, for the i.i.d. sample and W^n = (W_1, ..., W_n)^T. For θ_i = (β_i, g_i)^T, i = 1, 2, define a pseudo-distance d(θ_1, θ_2).

Let B_n = {g_k(t) : g_k(t) = Σ_{j=1}^N α_j π_j(t), max_{j=1,...,N} |α_j| ≤ M_3}, where {π_j}_{j=1}^N is a B-spline basis of order l, the dimension is N = k + l + 1, and, as usual, k = O(n^τ) with 0 < τ < 1. For each g ∈ B there exists π_n g ∈ B_n with ‖π_n g − g‖_∞ ≤ O(k^{−r}). Then Θ_n = A × B_n can be used as a sieve for Θ. The empirical criterion is

    L_n(θ, W^n) = (1/n) Σ_{i=1}^n ρ(Y_i − μ{β^T X_i + g(T_i)}),

where ρ(·) is a convex loss function. The sieve M-estimator is then required to satisfy

    θ̂_n = (β̂_n, ĝ_n)^T = argmin_{θ∈Θ_n} L_n(θ, W^n).

Because Θ_n is bounded and closed and L_n is continuous, θ̂_n must exist. Suppose θ_0 is the unique minimizer of the population criterion E_0 ρ(Y − μ{β^T X + g(T)}); since ε is independent of (X, T), E_0 ρ′(ε) = 0. Under some conditions on the GPLM, we have the following asymptotic properties.

Theorem 1 (strong consistency). Under some necessary conditions, d(θ̂_n, θ_0) → 0 a.s. P_{θ_0}. If an additional condition is satisfied, then ‖β̂_n − β_0‖ → 0 and ‖ĝ_n − g_0‖_2 → 0, a.s. P_{θ_0}.

Theorem 2 (rate of convergence). Under some necessary conditions, d(θ̂_n, θ_0) = O_p(max(n^{−(1−τ)/2}, n^{−rτ})). Taking τ = 1/(1 + 2r), the rate of convergence is d(θ̂_n, θ_0) = O_p(n^{−r/(1+2r)}); if an additional condition is satisfied, then ‖β̂_n − β_0‖ = O_p(n^{−r/(1+2r)}) and ‖ĝ_n(T) − g_0(T)‖_2 = O_p(n^{−r/(1+2r)}).
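The sieve construction above can be sketched in a minimal simulation. The identity link for μ, the Huber loss for ρ, the assumed smoothness r = 2, and the simulated data below are all illustrative choices not specified in the abstract:

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Simulated GPLM data with identity link: Y = beta'X + g(T) + eps
n, p = 400, 2
X = rng.normal(size=(n, p))
T = rng.uniform(size=n)
beta_true = np.array([1.0, -0.5])
g_true = lambda t: np.sin(2 * np.pi * t)
Y = X @ beta_true + g_true(T) + rng.standard_t(df=3, size=n)  # heavy-tailed error

# Sieve B_n: B-spline basis of order l with k = O(n^tau) interior knots,
# tau = 1/(1 + 2r) for an assumed smoothness r = 2
l, r = 3, 2
k = max(1, int(n ** (1.0 / (1 + 2 * r))))
knots = np.concatenate([np.zeros(l + 1),
                        np.linspace(0, 1, k + 2)[1:-1],
                        np.ones(l + 1)])
N = len(knots) - l - 1              # sieve dimension, N = k + l + 1

def basis(t):
    """n x N B-spline design matrix whose columns are the basis pi_j."""
    return np.column_stack([BSpline(knots, np.eye(N)[j], l)(t) for j in range(N)])

B = basis(T)

def huber(res, c=1.345):
    """Convex Huber loss playing the role of rho."""
    a = np.abs(res)
    return np.where(a <= c, 0.5 * res ** 2, c * a - 0.5 * c ** 2)

def L_n(theta):
    """Empirical criterion (1/n) * sum rho(Y_i - beta'X_i - g(T_i))."""
    beta, alpha = theta[:p], theta[p:]
    return huber(Y - X @ beta - B @ alpha).mean()

fit = minimize(L_n, np.zeros(p + N), method="L-BFGS-B")
beta_hat = fit.x[:p]
```

The minimizer over the sieve recovers β despite the heavy-tailed errors, which is the point of using a robust convex ρ instead of squared loss.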
Theorem 3 (asymptotic normality). Under some necessary conditions, √n(β̂_n − β_0) converges in distribution to a normal limit, so β̂_n is asymptotically normal.

Second, the MLE is given when Y is type I interval-censored data; under some conditions, its asymptotic properties are obtained. When Y is type II interval-censored data or grouped data, the problem can be handled similarly.

The density function of (X, T) is φ(x, t) and T ~ U[0, 1]. Under type I censoring, Y cannot be observed directly; instead, the subject is examined at a random time Z and one observes δ = I{Y ≤ Z}. The density function of Z is ψ(z). The error ε is independent of (X, T) with Eε = 0; its distribution F is known, f is continuously differentiable, and f > 0. Write W = (X, T, δ, Z)^T; its density function is

    Q(w, θ) = F(z − μ{β^T x + g(t)})^δ [1 − F(z − μ{β^T x + g(t)})]^{1−δ} φ(x, t) ψ(z).

The corresponding log-likelihood is l(θ, W) = log Q(W, θ). Denote by P_θ the distribution of W when the parameter is θ, and by E_0 the expectation with respect to P_{θ_0}. Let W_i = (X_i, T_i, δ_i, Z_i)^T, i = 1, 2, ..., n, be the i.i.d. sample and W^n = (W_1, ..., W_n)^T. Θ_n is again the sieve for Θ. The empirical criterion is

    L_n(θ, W^n) = P_n l(θ, W) = (1/n) Σ_{i=1}^n l(θ, W_i),

and the sieve MLE is required to satisfy

    θ̂_n = (β̂_n, ĝ_n)^T = argmax_{θ∈Θ_n} L_n(θ, W^n).

Suppose θ_0 is the unique maximizer of E_0 l(θ, W), where E_0 = E_{θ_0}. Under some conditions on the GPLM, the same asymptotic properties hold: strong consistency and the rate of convergence. In addition:

Theorem 4 (efficient score function and Fisher information matrix). Under some necessary conditions, the efficient score function for β is l*_β = −D(ζ(θ_0, W))(X − E_J(X|T)), and the Fisher information matrix is I(β_0) = E_0(l*_β l*_β^T) = C_1 E_J[(X − E_J(X|T))^{⊗2}]. The Fisher information matrix is positive definite with every component bounded.

Theorem 5 (asymptotic normality). Under some necessary conditions, √n(β̂_n − β_0) →_d N(0, I(β_0)^{−1}); hence β̂_n is asymptotically normal and efficient.

B-spline functions are used to construct the sieve because of their local support: when the nonparametric part varies sharply in places, spline functions are suitable.
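The likelihood Q(w, θ) for type I interval censoring can be sketched as follows. To isolate the likelihood itself, g is treated as known and F is taken to be standard normal; both are sketch choices (the full estimator would replace g by its B-spline sieve as above). Since φ(x, t) and ψ(z) do not involve θ, they drop out of the maximization:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)

# Type I interval censoring: Y = beta*X + g(T) + eps is latent; we observe
# only (X, T, Z, delta) with delta = 1{Y <= Z} at a random inspection time Z.
n = 500
X = rng.normal(size=n)
T = rng.uniform(size=n)
Z = rng.uniform(-2.0, 4.0, size=n)              # inspection time, density psi
beta_true = 1.0
g = lambda t: t ** 2                            # "known" nonparametric part (sketch)
Y = beta_true * X + g(T) + rng.normal(size=n)   # latent response, F = N(0,1) known
delta = (Y <= Z).astype(float)

def neg_loglik(b):
    # log Q contribution: delta*log F(z - eta) + (1 - delta)*log(1 - F(z - eta))
    eta = b[0] * X + g(T)
    F = np.clip(norm.cdf(Z - eta), 1e-12, 1 - 1e-12)  # numerical guard
    return -np.mean(delta * np.log(F) + (1 - delta) * np.log(1 - F))

fit = minimize(neg_loglik, x0=np.array([0.0]), method="BFGS")
beta_hat = fit.x[0]
```

Even though Y itself is never observed, the binary current-status indicator δ together with Z carries enough information to estimate β consistently.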
When the nonparametric part is globally smooth, algebraic polynomials can be used to construct the sieve space, and trigonometric polynomials can be used when the nonparametric part is periodic. In both cases we obtain similar conclusions.

Third, we construct a new dependence measure D based on copulas. Some properties related to quadrant dependence are discussed, and the numerical measure λ of D is given. A generalized measure is also given in order to handle the more complex dependence among random vectors. Estimating by empirical copulas, some asymptotic properties of the sample measures are obtained.

By Sklar's theorem, for continuous random variables X, Y with copula C, let

    D(u, v) = C(u, v) / (uv),

where u, v ∈ (0, 1]. This is analogous to the multiplication formula for conditional probability: C(u, v) = D(u, v)·Π, where Π = uv is the independence copula. Many parametric copula families are of the form D(u, v)·uv. As a functional coefficient in this representation of copulas, D(u, v) plays an essential role in the study of dependence.

Theorem 6. If u = 1 or v = 1, then D(u, v) = 1; W(u, v)/(uv) ≤ D(u, v) ≤ M(u, v)/(uv), where W and M are the Fréchet–Hoeffding bounds; D(u, v) is invariant under strictly increasing transformations of the continuous X and Y; the continuous random variables X and Y are PQD (NQD) if and only if D(u, v) ≥ (≤) 1 for all u, v ∈ (0, 1]. Moreover:
1. LTD(Y|X) if and only if, for all v ∈ (0, 1], D(u, v) is nonincreasing in u;
2. RTI(Y|X) if and only if, for all v ∈ (0, 1], D(u, v) is nondecreasing in u;
3. LTD(Y|X) and LTD(X|Y) if and only if, for all u, u′, v, v′ ∈ (0, 1] such that 0 < u ≤ u′ ≤ 1 and 0 < v ≤ v′ ≤ 1, D(u, v) ≥ D(u′, v′);
4. RTI(Y|X) and RTI(X|Y) if and only if, for all u, u′, v, v′ ∈ (0, 1] such that 0 < u ≤ u′ ≤ 1 and 0 < v ≤ v′ ≤ 1, D(u, v) ≤ D(u′, v′).

When considering limiting behavior, λ must be kept away from ∞; for 0 < ε, a, b < 1, define

    λ(ε) = ∫_I ∫_I [C(u, v)/(uv + ε) − 1] du dv,  λ(a, b) = ∫_{[b,1]} ∫_{[a,1]} [D(u, v) − 1] du dv.

Obviously λ(a, b) can only capture the dependence over the region [a, 1] × [b, 1], but this defect can be minimized by choosing a and b comparatively small.

Theorem 7. If ε, a, b are given as above, then

    sup_{u∈[a,1], v∈[b,1]} |D_n(u, v) − D(u, v)| → 0 a.s.;  λ_n(ε) → λ(ε), λ_n(a, b) → λ(a, b) a.s., as n → ∞.

From the weak convergence of the empirical copula process and the continuous mapping theorem, we obtain the weak convergence of

    √n [D_n(u, v) − D(u, v)] I_{[a,1]}(u) I_{[b,1]}(v),  √n [λ_n(ε) − λ(ε)],  √n [λ_n(a, b) − λ(a, b)].

For continuous (X_1, ..., X_k), there exists a unique C such that F(x_1, ..., x_k) = C(F_1(x_1), ..., F_k(x_k)). Let A and B be two nonempty disjoint subsets of {1, 2, ..., k} with A ∪ B = {1, 2, ..., k}, and denote by X_A and X_B the random vectors (X_i | i ∈ A) and (X_i | i ∈ B) respectively. By Sklar's theorem,

    F(x_1, ..., x_k) = C(F_1(x_1), ..., F_k(x_k)),
    F_A(x_A) = C_A(F_{A_1}(x_{A_1}), ..., F_{A_p}(x_{A_p})),  F_B(x_B) = C_B(F_{B_1}(x_{B_1}), ..., F_{B_q}(x_{B_q})),

where F_A(x_A) and F_B(x_B) are the distributions of X_A and X_B respectively. Let

    D(u_A, u_B) = C(u) / (C_A(u_A) C_B(u_B)),

where u = (u_1, ..., u_k), u_A ∈ (0, 1]^p (here p = card(A) is the number of components in A), and u_B ∈ (0, 1]^q with q = k − p. The multiplication formula becomes C(u) = D(u_A, u_B)·C_A(u_A) C_B(u_B).

Theorem 8. As a dependence measure for X_A and X_B, D(u_A, u_B) satisfies:
1. W(u)/(M_A(u_A) M_B(u_B)) ≤ D(u_A, u_B) ≤ M(u)/(W_A(u_A) W_B(u_B));
2. D(u_A, u_B) is invariant under strictly increasing transformations of the continuous X_A and X_B;
3. writing C_A for the copula of X_A and C_B for that of X_B: given C_A and C_B, if C_1(u) ≤ C_2(u) for all u, then D_1 ≤ D_2.

Theorem 9. 1. LTD(X_B|X_A) if and only if, for all x_B, D(u_A, u_B) ≥ 1; 2. RTI(X_B|X_A) if and only if, for all x_B, D*(u_A, u_B) ≥ 1.

In order to keep λ away from ∞, let 0 < ε < 1 and 0 < a, b < 1; then

    λ(ε) = ∫_{I^q} ∫_{I^p} [C(u)/(C_A(u_A) C_B(u_B) + ε) − 1] du_A du_B,
    λ(a, b) = ∫_{[b,1]^q} ∫_{[a,1]^p} [D(u_A, u_B) − 1] du_A du_B.

Theorem 10. If ε, a, b are given as above, then

    sup_{u_A∈[a,1]^p, u_B∈[b,1]^q} |D_n(u_A, u_B) − D(u_A, u_B)| → 0 a.s.;  λ_n(ε) → λ(ε), λ_n(a, b) → λ(a, b) a.s., as n → ∞.

From the weak convergence of the empirical copula process and the continuous mapping theorem, we obtain the weak convergence of √n [D_n(u_A, u_B) − D(u_A, u_B)] I_{[a,1]^p × [b,1]^q}(u_A, u_B).

Fourth, we consider projection pursuit with copulas. Projection pursuit (PP) is an effective technique for dimension reduction: for a given data set, PP measures how interesting the projection directions are through a projection index. Given n observations of a p-dimensional random vector X = (X_1, ..., X_p)^T, one computes a projection matrix A = (a_1, ..., a_k)^T, 2 ≤ k ≤ p, with ‖a_i‖ = 1, i = 1, ..., k, so that X can be projected into a low-dimensional (k-dimensional) space.

Consider k = 2 and choose the index function h(a_1, a_2) as an L_2 distance involving C_{a_1,a_2}(u, v), the copula of (a_1^T X, a_2^T X); the sample version h_n(a_1, a_2) replaces C_{a_1,a_2}(u, v) by the empirical copula. The most interesting directions (a_{1n}, a_{2n}) are the points maximizing this expression. The strong consistency of (a_{1n}, a_{2n}) can then be obtained:

Theorem 11. Suppose (a_{10}, a_{20}) is the unique minimum point of h(a_1, a_2) and the distributions and density functions of a_1^T X and a_2^T X satisfy certain equicontinuity conditions; then (a_{1n}, a_{2n}) → (a_{10}, a_{20}) a.s. as n → +∞.

The assumption of a unique minimum point can be replaced by a set of minimum points. The above conclusions extend readily to the case 2 < k ≤ p using k-copulas. Projection pursuit principal component analysis can then be realized, and canonical correlation analysis between X and Y can be given similarly.
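The sample quantities D_n and λ_n(a, b) of the third part can be sketched directly from rescaled ranks. The simulated PQD pair and the Riemann-sum approximation of the integral below are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)

# A positively dependent (PQD) pair: Y = X + noise
n = 2000
X = rng.normal(size=n)
Y = X + rng.normal(size=n)

# Rescaled ranks U_i = rank(X_i)/n, V_i = rank(Y_i)/n
U = (np.argsort(np.argsort(X)) + 1) / n
V = (np.argsort(np.argsort(Y)) + 1) / n

def C_n(u, v):
    """Empirical copula C_n(u, v)."""
    return np.mean((U <= u) & (V <= v))

def D_n(u, v):
    """Sample dependence measure D_n(u, v) = C_n(u, v) / (u v), u, v in (0, 1]."""
    return C_n(u, v) / (u * v)

def lambda_n(a, b, m=40):
    """Midpoint Riemann sum for the integral of (D_n - 1) over [a,1] x [b,1]."""
    us = a + (np.arange(m) + 0.5) * (1 - a) / m
    vs = b + (np.arange(m) + 0.5) * (1 - b) / m
    du, dv = (1 - a) / m, (1 - b) / m
    return sum(D_n(u, v) - 1.0 for u in us for v in vs) * du * dv

# D_n(1, 1) = 1 by construction; D_n > 1 in the interior signals PQD, and
# lambda_n(a, b) > 0 aggregates that positive dependence over [a,1] x [b,1]
print(D_n(0.5, 0.5), lambda_n(0.2, 0.2))
```

Theorem 10's uniform a.s. convergence is what licenses reading these plug-in quantities as estimates of D and λ on [a, 1] × [b, 1].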
Keywords/Search Tags: Statistical