
Robust Variable Selection And Application Of Partial Linear Models With Functional Data

Posted on: 2024-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: H Wu
GTID: 2530306917990299
Subject: Statistics
Abstract/Summary:
With the advent of the big data era, computer technology has developed rapidly and been widely applied, and data with functional characteristics (functional data) have become increasingly common. Traditional analysis methods, however, perform poorly when used to build models for such data, so in recent years many scholars have worked on extending the methods of functional data analysis. Most of the theory focuses on building functional linear models with variable screening, but in practice linear relationships alone cannot fully explain the relationship between the covariates and the response variable, and the resulting model estimates can be unstable. Developing robust models for functional data within a variable selection framework has therefore become one of the most active research topics in statistics. This thesis proposes two robust variable selection methods for semi-functional partially linear models, which contain both functional and scalar covariates and allow a nonlinear relationship between the functional covariate and the response variable. The main work and results are as follows.

First, the thesis proposes a robust variable selection method for semi-functional partially linear models based on least absolute deviation (LAD) regression combined with functional nonparametric kernel regression. Following the idea of transforming the partially linear model into a linear model, the regression coefficients are estimated and the variables selected simultaneously through LAD regression with the SCAD penalty; the nonparametric operator is then estimated by functional nonparametric kernel regression. Asymptotic properties of the estimators are established under some regularity conditions. Numerical simulations comparing the method with least squares regression show that it is robust and selects variables accurately.

Second, the first method is essentially quantile regression at the 0.5 quantile. Because quantile regression at a single quantile level has limitations and nonparametric kernel regression suffers from boundary effects, the thesis proposes a robust variable selection method for semi-functional partially linear models based on locally weighted composite quantile regression. A three-stage estimation procedure is used: the nonparametric operator is estimated by locally weighted composite quantile regression, and this estimate is then combined with penalized composite quantile regression to form the objective function for variable selection in the parametric component. The convergence rates of the parametric component and the nonparametric operator, together with the oracle property of the variable selection, are obtained under some regularity conditions. In finite-sample numerical simulations, the method is compared with the first method and with three-stage locally weighted least squares regression, and the results show the advantage of the proposed method.

Finally, given the growing attention to air quality and its relevance to economic development, the thesis applies the proposed approach to identify the meteorological factors that influence air quality in the Chongqing region from the perspective of air pollutants. Air pollutant and meteorological data from 2014 to 2020 show that the main air pollutant, PM2.5, has clear functional characteristics. The daily average PM2.5 is therefore smoothed into a functional covariate; six meteorological indicators, including annual average temperature, annual average humidity, and annual average wind speed, are used as scalar covariates; and AQI is taken as the response variable to build a semi-functional partially linear model. The condition number indicates multicollinearity among the six meteorological covariates, so the proposed robust variable selection method is used for the analysis. The thesis reports the estimated regression parameters and the estimated curves of the nonparametric operator. The parameter estimates identify two of the six meteorological indicators as redundant, which indicates that the proposed method can identify the important meteorological factors affecting air quality.
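
For readers unfamiliar with the building blocks named in the abstract, the following is a minimal sketch of the standard textbook forms of the quantile check loss, the SCAD penalty, and a composite quantile loss. It is not the thesis's implementation: the function names, the default a = 3.7, and the argument layout are assumptions chosen for illustration. It also illustrates why least absolute deviation regression corresponds to the 0.5 quantile: at tau = 0.5 the check loss equals half the absolute residual.

```python
import numpy as np

def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0).astype(float))

def scad_penalty(beta, lam, a=3.7):
    """Standard SCAD penalty evaluated elementwise (a = 3.7 is a common default, assumed here)."""
    b = np.abs(np.asarray(beta, dtype=float))
    small = lam * b                                            # |beta| <= lam
    mid = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))    # lam < |beta| <= a * lam
    large = np.full_like(b, lam**2 * (a + 1) / 2)              # |beta| > a * lam
    return np.where(b <= lam, small, np.where(b <= a * lam, mid, large))

def composite_quantile_loss(residuals, intercepts, taus):
    """Sum of check losses over several quantile levels, one intercept per level."""
    r = np.asarray(residuals, dtype=float)
    return sum(check_loss(r - b_k, t).sum() for b_k, t in zip(intercepts, taus))
```

With a single level tau = 0.5 and a zero intercept, `composite_quantile_loss` reduces to half the sum of absolute residuals, that is, the LAD criterion up to a constant factor.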
Keywords/Search Tags: functional data analysis, least absolute deviation regression, composite quantile regression, variable selection