
The Study And Applications Of Structured Regularization For High-dimensional Data

Posted on: 2022-09-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W L Xie
Full Text: PDF
GTID: 1488306536961249
Subject: Statistics
Abstract/Summary:
High-dimensional data analysis is characterized by a large number of samples, a high dimensionality, or both. Regularization is of critical importance for high-dimensional problems and is widely used in statistics, signal processing, machine learning, and artificial intelligence. A great deal of attention has been paid to sparse regularization, which aims to find a tradeoff between a good fit and a sparse solution. The recent development of new measurement techniques has led to high-dimensional and complex data sets, which motivates the use of more structured types of regularization, and recent years have witnessed many structured regularization methods for high-dimensional problems. Structural information, when captured in a regression model, improves both prediction accuracy and model interpretability. Despite this encouraging progress, much work remains to be done. Building on earlier research, this dissertation therefore continues the study of structured regularization for high-dimensional data. Its main contributions can be summarized as follows.

Chapter 2 studies structured regularization with additional constraints on the coefficients in sparse high-dimensional linear models. Since nonnegativity constraints are simple and widely used in many applications, we propose the nonnegative hierarchical lasso, which imposes nonnegativity constraints on the coefficients and is capable of simultaneous selection at both the group and within-group levels, namely bi-level selection. We study the theoretical properties of the nonnegative hierarchical lasso in both the low-dimensional and the ultra-high-dimensional settings. In addition, a fast iterative half thresholding-based local linear approximation algorithm (IHT-LLA) is proposed for implementation. Finally, simulation studies and an application to index tracking demonstrate the performance of the nonnegative hierarchical lasso compared with other nonnegative methods.

Chapters 3 and 4 focus on the development of square-root-based structured regularization. Least squares-based procedures all suffer from the fact that the optimal value of the tuning parameter λ depends on the noise level σ, and accurately estimating σ when the dimension p is large may be as difficult as the original selection problem. By using the square-root loss function instead of the squared loss function, square-root regularization achieves optimality with a tuning parameter independent of the noise level σ. This makes square-root regularization more attractive when the dimension p is large, especially when p ≫ n. Combining the square-root loss function with a group elastic-net penalty, i.e. an ℓ2,1 + ℓ2 penalty, Chapter 3 proposes a new square-root regularization method called the Group Square-Root Elastic Net. In the theoretical analysis, we study correct subset recovery under a Group Elastic-Net Irrepresentable Condition. Both slow-rate and fast-rate bounds are established, the latter under a Restricted Eigenvalue assumption. For implementation, a fast algorithm based on scaled multivariate thresholding-based iterative selection is introduced, with proved convergence. A comparative study demonstrates the advantages of our approach over alternatives.

Applying square-root regularization to piecewise-smooth signals, Chapter 4 proposes a novel method called Structured Smooth Adjustment for Square-Root Regularization, which simultaneously selects grouped variables and encourages piecewise smoothness within each group. We show that, under mild conditions on the design matrix, the estimator achieves optimal estimation and prediction without relying on knowledge of the standard deviation σ of the error term or on a pre-estimate of σ. For implementation, an algorithm termed Scaled Dual Forward-Backward Splitting is proposed, with proved convergence. Finally, we carry out an experimental evaluation on both synthetic data and real data obtained from glioblastoma multiforme samples and gray images.
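The two penalized objectives summarized above can be sketched as follows. The exact weights, normalizations, and penalty forms are assumptions for illustration, not the dissertation's definitions: with groups g = 1, …, G, the nonnegative hierarchical lasso of Chapter 2 is written here in a group-bridge-type form (consistent with the half-thresholding step of IHT-LLA), and the Group Square-Root Elastic Net of Chapter 3 replaces the squared loss with the square-root loss.

```latex
% Nonnegative hierarchical lasso (Chapter 2), sketched as a
% group-bridge-type bi-level penalty under a nonnegativity
% constraint (the exact weighting is an assumption):
\min_{\beta \ge 0} \;
  \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2
  + \lambda \sum_{g=1}^{G} \bigl(\lVert \beta_g \rVert_1\bigr)^{1/2}

% Group Square-Root Elastic Net (Chapter 3): square-root loss plus
% an \ell_{2,1} + \ell_2 penalty; the optimal \lambda_1, \lambda_2
% no longer scale with the unknown noise level \sigma:
\min_{\beta} \;
  \frac{\lVert y - X\beta \rVert_2}{\sqrt{n}}
  + \lambda_1 \sum_{g=1}^{G} \lVert \beta_g \rVert_2
  + \lambda_2 \lVert \beta \rVert_2
```

Informally, because the square-root loss is 1-homogeneous in the residual, rescaling the noise rescales loss and penalty alike, which is why the tuning parameters can be chosen without estimating σ; the squared loss lacks this property.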
Keywords/Search Tags:Linear models, Structured regularization, Structured sparsity, Square-root regularization, Nonnegative constraints