PGFR Variable Screening For Ultra-high Dimensional Partially Linear Model

Posted on: 2018-04-21
Degree: Master
Type: Thesis
Country: China
Candidate: Q N Lai
Full Text: PDF
GTID: 2310330563952377
Subject: Statistics
Abstract/Summary:
The partially linear model is a type of semiparametric regression model whose structure combines a linear regression component with a nonparametric regression component. It retains both the easy interpretability of parametric regression models and the flexibility of nonparametric regression models, so it often fits real data better than either alone. With the rapid development of technology and computer science, ultra-high dimensional data appear frequently in biomedicine, engineering, finance and other fields, and how to process and analyze such data has become one of the most popular and challenging questions. There is a large body of research on variable screening for parametric regression models with ultra-high dimensional data, but few studies consider the ultra-high dimensional partially linear model; most work on this model focuses on low dimensional settings. Variable screening for the partially linear model in ultra-high dimensional settings is therefore of both theoretical and practical significance.

In this dissertation, we consider the ultra-high dimensional partially linear model, where the dimension p of the linear component is much larger than the sample size n, and p can grow exponentially with n. To select the significant variables effectively while accounting for the correlation between variables, we build on the profiled forward regression (PFR) algorithm and propose a profile greedy forward regression (PGFR) algorithm. The simulation studies show that the profile greedy forward regression method performs very well. The main work of this dissertation includes the following two aspects:

1. First, we transform the ultra-high dimensional partially linear model into an ultra-high dimensional linear model using the profile technique from semiparametric regression. Second, to carry out variable screening for the high-dimensional linear component, we propose a variable screening method called profile greedy forward regression (PGFR). The proposed PGFR method not only accounts for the correlation between the covariates, but also identifies the relevant predictors consistently and possesses the screening consistency property under some regularity conditions. We further propose a BIC criterion for choosing the final model, so that the selected model contains the true model with probability tending to one.

2. Using the proposed PGFR algorithm, we choose 2 or 4 variables as the important variables at each step. First, based on the numerical results of three different simulation examples, we find that the proposed PGFR method is effective for variable screening. Comparing the proposed PGFR procedure with existing methods, namely the PFR method, the sure independence screening (SIS) method and the iterative sure independence screening (ISIS) algorithm, we find that PGFR performs better for the ultra-high dimensional partially linear model with high collinearity and a low signal-to-noise ratio when 4 variables are added to the selected model at each step. A real data example is then analyzed to assess the performance of the proposed PGFR method and to compare it with PFR, SIS and ISIS.

Finally, in the conclusion and prospects, we summarize the main research achievements and innovations of this dissertation and point out issues and directions for further research.
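The two-stage procedure described above (profiling out the nonparametric part, then adding several variables per forward step and choosing a model by BIC) can be illustrated with a rough Python sketch. This is not the dissertation's implementation: the abstract gives no formulas, so the Gaussian Nadaraya-Watson smoother, the bandwidth, the "largest individual reduction in residual sum of squares" selection rule, the form of the BIC penalty, and all function names (`nw_smooth`, `profile`, `greedy_forward`, `bic`, `demo`) are assumptions made for illustration only.

```python
import numpy as np

def nw_smooth(u, v, bandwidth):
    """Nadaraya-Watson estimate of E[v | u] at the sample points (Gaussian kernel, assumed)."""
    diff = (u[:, None] - u[None, :]) / bandwidth
    w = np.exp(-0.5 * diff ** 2)
    w /= w.sum(axis=1, keepdims=True)
    return w @ v  # works for a vector v or a matrix v with n rows

def profile(y, X, u, bandwidth=0.1):
    """Remove the estimated conditional means given u, leaving an approximately linear model."""
    return y - nw_smooth(u, y, bandwidth), X - nw_smooth(u, X, bandwidth)

def greedy_forward(y, X, per_step=2, max_size=10):
    """Add up to `per_step` variables per step; return the nested path of index lists."""
    n, p = X.shape
    selected, path = [], []
    chosen = np.zeros(p, dtype=bool)
    while len(selected) < max_size:
        if selected:
            # Project y and every column of X off the span of the current model.
            Q, _ = np.linalg.qr(X[:, selected])
            r = y - Q @ (Q.T @ y)
            Xp = X - Q @ (Q.T @ X)
        else:
            r, Xp = y, X
        norms = np.einsum("ij,ij->j", Xp, Xp)
        gain = np.zeros(p)
        ok = norms > 1e-10
        # RSS reduction from adding each candidate on its own (assumed selection rule).
        gain[ok] = (Xp[:, ok].T @ r) ** 2 / norms[ok]
        gain[chosen] = -np.inf
        k = min(per_step, max_size - len(selected))
        new = np.argsort(-gain)[:k]
        chosen[new] = True
        selected.extend(int(j) for j in new)
        path.append(list(selected))
    return path

def bic(y, X, model):
    """BIC-type score with a log(n) + 2 log(p) penalty per variable (assumed form)."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X[:, model], y, rcond=None)
    rss = np.sum((y - X[:, model] @ beta) ** 2)
    return np.log(rss / n) + len(model) * (np.log(n) + 2 * np.log(p)) / n

def demo(seed=0, n=200, p=500, per_step=2):
    """Simulated example: four active linear covariates plus a smooth function of u."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    u = rng.uniform(0.0, 1.0, n)
    beta = np.zeros(p)
    beta[:4] = 3.0
    y = X @ beta + np.sin(2 * np.pi * u) + 0.5 * rng.standard_normal(n)
    y_star, X_star = profile(y, X, u)
    path = greedy_forward(y_star, X_star, per_step=per_step, max_size=10)
    return min(path, key=lambda m: bic(y_star, X_star, m))
```

Calling `demo()` returns the index list of the model on the forward path with the smallest score; `per_step=2` or `per_step=4` mirrors the two settings mentioned in the abstract. Scoring candidates after projecting off the current model is one simple way to take correlation between covariates into account, which is the stated motivation for PGFR, though the dissertation's actual rule may differ.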
Keywords/Search Tags:Partially linear model, profile greedy forward regression, ultra-high dimensionality, screening consistency, variable screening