Inference Of High-dimensional Multi-task Regression Via Hybrid Orthogonalization

Posted on:2024-01-31

Degree:Master

Type:Thesis

Country:China

Candidate:Y L Ma

GTID:2530306932454914

Subject:Data Science (Statistics)

Abstract/Summary:

With rapid economic development and technological innovation, high-dimensional data now arise in many fields, such as biomedical science and economics. Recently, many studies have investigated high-dimensional multi-task regression under a sparsity assumption, that is, they explore the linear relationships between multiple responses and covariates in high-dimensional settings. In fields with high reliability requirements, such as clinical research, effectively quantifying the uncertainty of estimates is of great concern. As bias-correction methods have become widely used for single-response inference, some scholars have extended them to multi-response scenarios. However, the existing methods impose strict requirements on sample size and sparsity in order to establish asymptotic properties. To relax the constraints on the sample size and on the number of nonzero rows of the coefficient matrix, we focus on statistical inference for the unknown coefficient matrix in high-dimensional multi-task regression. We define the set of important predictors using distance correlation and then, based on hybrid orthogonalization vectors, propose a new statistic constructed in a row-wise manner. The hybrid orthogonalization vector achieves strict orthogonalization against the columns corresponding to the important predictors and relaxed orthogonalization against the columns corresponding to the remaining variables. By distinguishing important variables from the others, the proposed method eliminates the influence of the important signals and thus allows more significant signals to be present. It also reduces the required sample size and improves the efficiency of statistical inference. Furthermore, we prove the asymptotic normality of the proposed statistic and construct confidence intervals for all elements of the unknown coefficient matrix. Numerical experiments show that the new method has clear advantages when average coverage probability and average confidence-interval length are considered together. Finally, we apply the proposed method to Alzheimer's disease data and ovarian cancer data; the results demonstrate its practical significance in biomedicine and further validate the effectiveness of the method.
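The abstract gives no formulas, so the following is only a minimal sketch of the hybrid orthogonalization idea under stated assumptions. It is not the thesis's actual procedure. It assumes that strict orthogonalization is done by projecting out the important columns exactly, and that relaxed orthogonalization takes the residual of a lasso regression on the remaining columns, as is common in de-biased lasso methods. The function names (`lasso_cd`, `hybrid_orthogonalization_vector`, `debiased_row`), the tuning parameter `lam`, and the initial estimate `B_init` are hypothetical. The set of important predictors is passed in directly rather than selected by distance correlation.

```python
import numpy as np


def lasso_cd(A, b, lam, n_sweeps=500):
    """Coordinate descent for min_w (1/(2n))*||b - A w||^2 + lam*||w||_1."""
    n, p = A.shape
    w = np.zeros(p)
    col_sq = (A ** 2).sum(axis=0) / n
    r = b.astype(float).copy()  # running residual b - A w
    for _ in range(n_sweeps):
        for k in range(p):
            if col_sq[k] == 0.0:
                continue
            # correlation of column k with the partial residual
            rho = A[:, k] @ r / n + col_sq[k] * w[k]
            # soft-thresholding update
            w_new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[k]
            r += A[:, k] * (w[k] - w_new)
            w[k] = w_new
    return w


def hybrid_orthogonalization_vector(X, j, important, lam):
    """Vector z_j: exactly orthogonal to the important columns (other than j),
    approximately orthogonal (lasso residual) to the remaining columns.
    Assumption: the important columns are linearly independent and fewer than n."""
    n, p = X.shape
    strict = [k for k in important if k != j]
    relaxed = [k for k in range(p) if k != j and k not in strict]
    if strict:
        Q, _ = np.linalg.qr(X[:, strict])  # orthonormal basis of the strict columns

        def project_out(M):
            return M - Q @ (Q.T @ M)
    else:
        def project_out(M):
            return M
    xj = project_out(X[:, j])        # strict step applied to column j
    R = project_out(X[:, relaxed])   # strict step applied to the other columns
    w = lasso_cd(R, xj, lam)         # relaxed step: penalized regression
    return xj - R @ w


def debiased_row(X, Y, B_init, j, z):
    """Bias-corrected estimate of row j of the coefficient matrix
    (standard de-biasing form; assumed, not taken from the thesis)."""
    resid = Y - X @ B_init
    return B_init[j] + (z @ resid) / (z @ X[:, j])
```

In this sketch, `z` is exactly orthogonal to the important columns by construction. Once the coordinate descent has converged, its inner product with each remaining column, divided by `n`, is bounded by `lam`. That is one way to read the "strict" versus "relaxed" distinction drawn in the abstract.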

Keywords/Search Tags:

Statistical inference, Confidence interval, Bias correction, Feature screening, Hybrid orthogonalization

Related items

1	Statistical Inference Of Odds Ratio On Binary Data In Clinical Trials
2	Selection-model-based Meta-analysis With Publication Bias Correction
3	Statistical Inference Of Generalized Half-normal Distributions Based On Constant Accelerated Life Tests Of Geometric Processes
4	Inference Of Gamma Distribution And Its Environment Factor
5	Statistical Inference For Two Classes Of Statistical Models With Missing Data
6	The Comparison Of Some Simultaneous Confidence Intervals
7	Graphical Constrained Projection Inference Approach For High-dimensional Precision Matrix
8	Study On The Theory And Application Of The Exact Post-selection Inference For Lasso
9	Statistical Inference For Estimating Equations Statistical Models With Missing Data
10	Studies On Balanced Estimation And Adaptive Projection Inference In High Dimensional Statistical Learning