
Fast, Adaptive And Selection-Effective Variable Selection Methods For Artificial Neural Networks And Nonparametric Additive Models

Posted on: 2023-04-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Y Ma
Full Text: PDF
GTID: 1527306902982629
Subject: Statistics
Abstract/Summary:
With the rapid development of science and technology, researchers can collect ever more data, including high-dimensional data with many variables, such as medical image processing data, gene expression data, and financial data. In high-dimensional data, many covariates bear no relationship to the response variable. Including too many redundant variables not only increases computational complexity but also degrades the results of statistical inference. Selecting the important variables is therefore essential when analyzing high-dimensional data, and variable selection for high-dimensional data has become one of the most active research problems in statistics and machine learning. Many statistical methods have been developed to address it, such as LASSO and Group LASSO. In recent years, artificial neural network models have become increasingly popular with researchers because of their strong empirical performance. However, an artificial neural network contains many parameters and requires a large number of samples to train. For high-dimensional datasets the sample size n is usually smaller than the number of variables p; if we apply an artificial neural network to high-dimensional data directly, the sample size is far smaller than the number of model parameters and good results are hard to obtain. How to use artificial neural networks on high-dimensional datasets is therefore an important question. In addition, nonparametric models suffer from the curse of dimensionality when handling high-dimensional data.

The main issues considered in this dissertation are the following: (1) how to choose the tuning parameters simply and with low computational complexity; (2) how to avoid storing the basis matrix so as to use less memory; (3) how to solve the resulting nonconvex optimization problems; (4) how to adapt the variable selection method to the characteristics of the model; and (5) how to select important variables more effectively. These problems motivate us to propose several new variable selection methods based on artificial neural networks and nonparametric additive models. The specific research work is as follows.

In the second chapter, we introduce variable selection based on artificial neural networks. Although artificial neural networks are not yet fully understood, their huge success in many fields has made them increasingly popular. We assume the real-valued response Y ∈ R^m is related to the real-valued feature vector X ∈ R^{p_1} through the conditional expectation E[Y | X = x] = μ(x) for some unknown function μ: R^{p_1} → Γ ⊆ R^m. A standard fully connected ANN μ_θ with l layers is μ_θ(x) = S_l ∘ ⋯ ∘ S_1(x), where θ are the parameters indexing the ANN. For k < l, the nonlinear functions are S_k(u) = σ(b_k + W_k u), where W_k is a p_{k+1} × p_k matrix and b_k is a bias vector. At the last layer (k = l), S_l(u) = G(c + W_l u), where G: R^m → Γ is a link function mapping R^m into the parameter space Γ. In this model, the first-layer weight matrix W_1 multiplies the covariate x directly. If the elements of W_1 that multiply x_j, the j-th element of x, are all zero, we conclude that the j-th variable is not useful for prediction. This inspires us to add a penalty term on W_1 when constructing the optimization problem (the LASSO idea). After adding the penalty term, two main problems arise: how to choose the tuning parameter that controls the complexity of the model, and how to solve the resulting nonconvex optimization problem. We use the quantile universal threshold (Giacobino et al., 2017) to select the tuning parameter and give an explicit formula for it, so its value can be computed directly. Compared with cross-validation, our method requires less computation because it avoids training many candidate models. We also design an algorithm for the nonconvex optimization problem that finds a local minimum. Extensive experiments on a range of regression and classification problems show that our method not only selects variables well but also exhibits a phase transition phenomenon similar to that of the LASSO under the linear model. This discovery is important: it offers researchers a way to understand artificial neural networks and encourages further theoretical research on LASSO-penalized artificial neural networks.
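To make the first-layer penalty concrete, here is a minimal sketch, assuming PyTorch, of group-lasso selection on W_1 via proximal gradient steps. The network size, the data, and the value of lam are placeholders, and the loop is purely illustrative; in particular, the dissertation computes the tuning parameter by the quantile universal threshold, which this sketch does not implement.

```python
# Illustrative sketch only: group-lasso penalty on the first-layer weights W1
# of a small ANN, optimized by proximal gradient steps. Columns of W1 that are
# driven exactly to zero mark covariates judged useless for prediction.
# lam is a placeholder; the dissertation derives it via the quantile universal threshold.
import torch

def prox_w1(W, thresh):
    # Column-wise block soft-thresholding: column j of W1 multiplies covariate x_j,
    # so shrinking a whole column to zero removes variable j from the model.
    norms = W.norm(dim=0, keepdim=True)                    # one norm per input column
    return W * torch.clamp(1.0 - thresh / (norms + 1e-12), min=0.0)

torch.manual_seed(0)
n, p, h = 200, 50, 16                                      # samples, inputs, hidden units
X = torch.randn(n, p)
y = torch.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.1 * torch.randn(n)   # only x_1, x_2 matter

net = torch.nn.Sequential(torch.nn.Linear(p, h), torch.nn.ReLU(), torch.nn.Linear(h, 1))
lr, lam = 1e-2, 0.05
opt = torch.optim.SGD(net.parameters(), lr=lr)

for _ in range(2000):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(net(X).squeeze(-1), y)
    loss.backward()
    opt.step()
    with torch.no_grad():                                  # proximal step on W1 only
        net[0].weight.copy_(prox_w1(net[0].weight, lr * lam))

kept = torch.nonzero(net[0].weight.norm(dim=0) > 0).flatten()
print("selected variables:", kept.tolist())                # ideally close to [0, 1]
```

Unlike a plain subgradient method, the proximal step produces exact zeros in W_1, so the selected variable set can be read off directly from the fitted first layer.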
In the third chapter, we introduce variable selection based on the traditional nonparametric additive model. To help detect relevant input entries, additive models assume that nonlinear associations may occur in all directions by approximating the underlying association with μ(x) = Σ_{j=1}^{p} μ_j(x_j), where the μ_j are univariate functions. Although not dense in multivariate function space, additive models provide more flexibility than linear models. To fit a wide range of univariate functions μ_j, including linear and absolute-value functions, the expansion-based approach assumes that each univariate function can be written as μ_j(x) = Σ_{k=1}^{n} β_{j,k} φ_k(x), where {φ_k}_{k=1}^{n} are chosen basis functions and {β_{j,k}}_{k=1}^{n} are the corresponding unknown coefficients, estimated from the training set, which are also the parameters of the model. If the coefficients {β_{j,k}}_{k=1}^{n} for the j-th variable are all zero, then the j-th variable is useless for prediction. This inspires us to add a penalty term on the parameters {β_{j,k}, k = 1, ..., n, j = 1, ..., p} for variable selection. After adding the penalty term, the first problem encountered is again how to choose the tuning parameter that controls the complexity of the model. We propose two methods for tuning parameter selection. For the one-dimensional model, we choose the tuning parameter by minimizing Stein's unbiased risk estimate and give an explicit formula for the estimate. Although this formula requires estimating the unknown parameter σ (the standard deviation of the response variable Y), a simple and effective estimate of σ can be obtained in the one-dimensional case. For high-dimensional models, we use the quantile universal threshold to select the tuning parameter and give an explicit formula for it, which needs little computation. A further advantage, relative to AMlet (Sardy and Tseng, 2004), is that our method avoids estimating σ, which is difficult to estimate well in high-dimensional cases. To solve the optimization problem, we also propose a corresponding optimization algorithm and prove its convergence. In addition, because our method uses wavelet basis functions, it does not need to store the basis matrix, which saves memory and allows application to high-dimensional cases. We demonstrate the effectiveness of our method through Monte Carlo experiments and real data analysis, and we demonstrate its advantages by comparison with results from other methods.
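As a concrete, simplified illustration of this penalized additive fit (a sketch under assumptions, not the dissertation's algorithm), the following applies a group lasso over each variable's block of basis coefficients, with a plain polynomial basis and a fixed placeholder lam standing in for the wavelet basis and the SURE / quantile-universal-threshold tuning rules described above.

```python
# Minimal sketch of variable selection in an additive model via basis expansion
# plus a group-lasso penalty on each variable's coefficient block {beta_{j,k}}.
# A polynomial basis and a fixed lam stand in for the dissertation's wavelet
# basis and its SURE / quantile-universal-threshold tuning rules.
import numpy as np

def basis(x, K):
    # phi_1..phi_K evaluated at x; a stand-in for the chosen basis functions.
    return np.column_stack([x ** k for k in range(1, K + 1)])

def additive_group_lasso(X, y, lam=1.0, K=5, lr=1e-3, iters=5000):
    n, p = X.shape
    Phi = np.hstack([basis(X[:, j], K) for j in range(p)])   # n x (p*K) design
    beta = np.zeros(p * K)
    for _ in range(iters):
        beta -= lr * Phi.T @ (Phi @ beta - y) / n            # gradient step, squared loss
        for j in range(p):                                   # block soft-threshold per variable
            b = beta[j * K:(j + 1) * K]
            nrm = np.linalg.norm(b)
            beta[j * K:(j + 1) * K] = max(0.0, 1.0 - lr * lam / (nrm + 1e-12)) * b
    return beta.reshape(p, K)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(300, 20))
y = np.sin(np.pi * X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.normal(size=300)
B = additive_group_lasso(X, y)
print("selected:", np.where(np.linalg.norm(B, axis=1) > 0)[0])  # nonzero blocks
```

The block soft-threshold zeroes an entire coefficient block at once, so a variable is either kept with a fitted univariate function or removed completely, mirroring the all-or-nothing structure of the penalty on {β_{j,k}}.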
In the fourth chapter, we again work on variable selection for the traditional nonparametric additive model, but unlike the work in the third chapter, we select variables using a different idea: we artificially add an error to a variable. If a large added error has no significant effect on the response variable, then the variable has no effect on the response; conversely, if even a small added error has a large effect, the variable is useful. Using this idea, we construct a likelihood function based on the observations with assumed errors and maximize it after adding constraints on those errors. For the resulting optimization problem, we propose an algorithm to solve it and select variables according to the solution. Approaching variable selection in additive models from this new perspective improves our results, especially for correlated variables; in particular, the results are better when no interaction effects are present between variables. Moreover, the screening likelihood we construct has a univariate, nonparametric form, which allows us to effectively separate the contributions of different correlated variables. We also construct a computationally efficient estimate to ease the computational burden, and we provide some theoretical results. Finally, we evaluate the performance of this method through Monte Carlo experiments and compare it with other methods, showing its advantages and disadvantages. A toy sketch of the perturbation idea appears after the chapter summary below. In the fifth chapter, we summarize our work and outline plans for future work.
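Returning to the fourth chapter's idea, below is a toy, model-agnostic sketch of the perturbation principle. It is one reading of the abstract, not the dissertation's constrained screening likelihood: it injects artificial measurement error into one covariate at a time and ranks variables by how much the fit degrades. The callable fit_predict is an assumed user-supplied fitting routine.

```python
# Toy illustration of the perturbation idea: contaminate one covariate at a
# time with artificial noise and score it by the resulting loss increase.
# Variables whose contamination barely changes the loss are screened out.
# This is NOT the dissertation's constrained screening likelihood; fit_predict
# is an assumed callable fit_predict(X_train, y_train, X_eval) -> predictions.
import numpy as np

def perturbation_scores(X, y, fit_predict, sigma=1.0, reps=20, seed=0):
    rng = np.random.default_rng(seed)
    base = np.mean((y - fit_predict(X, y, X)) ** 2)          # unperturbed loss
    n, p = X.shape
    scores = np.zeros(p)
    for j in range(p):
        for _ in range(reps):
            Xn = X.copy()
            Xn[:, j] += rng.normal(0.0, sigma, size=n)       # artificial measurement error
            scores[j] += (np.mean((y - fit_predict(X, y, Xn)) ** 2) - base) / reps
    return scores                                            # larger => more relevant

# Example plug-in: a ridge-style linear fit standing in for the additive model.
def ridge_fit_predict(Xtr, ytr, Xev, alpha=1e-2):
    w = np.linalg.solve(Xtr.T @ Xtr + alpha * np.eye(Xtr.shape[1]), Xtr.T @ ytr)
    return Xev @ w
```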
Keywords/Search Tags: variable selection, artificial neural network, nonparametric additive model, quantile universal threshold, LASSO, measurement error selection likelihood