Quantile Regression Methods For Estimating And Identifying Latent Group Structures

Posted on:2024-04-15

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y Xing

Full Text:PDF

GTID:1527307112989169

Subject:Machine learning and bioinformatics

Abstract/Summary:

Extracting group structure and identifying homogeneous subgroups in regression have received increasing attention in recent years. For heterogeneous data, the homogeneity assumption of classical statistical models leads to biased estimates, so it is critical to identify homogeneous subgroups within a heterogeneous population. In high-dimensional data analysis, covariates also exhibit group structure; for example, a categorical variable can be represented by a group of dummy variables. Variable selection methods based on the sparsity assumption, such as LASSO, tend to arbitrarily select only one variable from each group, making the model difficult to interpret. Homogeneity is a more general assumption than sparsity: it allows more variables to be selected and provides information about the relationships among covariates, which enhances model interpretability and improves predictive performance.

A large body of literature has developed subgroup analysis methods for identifying homogeneous group structure in a heterogeneous population. However, the penalty terms of existing methods based on pairwise fusion penalties contain a large number of redundant pairwise differences of individual effects, leading to statistical and computational inefficiency. To solve this problem, we propose a subgroup analysis method based on the median regression model to estimate and identify homogeneous subgroups for network-linked data. We use both the covariates and the network to identify subgroup structures in a heterogeneous population, where heterogeneity arises from unknown or unobserved latent factors. We automatically divide the sample into subgroups by penalizing the pairwise differences of intercepts for individuals connected by an edge in the network. The proposed method can also predict the response for new subjects for whom only covariates are available, by taking advantage of the network reconstructed after these new subjects are added. We solve the nonconvex optimization problem using the local linear approximation and establish the oracle properties of the proposed estimator under some regularity conditions. Our simulation studies show that the proposed method can effectively identify homogeneous subgroups. Finally, the advantages of the proposed method are further illustrated by an analysis of real estate transaction data.

In addition, methods for identifying homogeneous subgroups of regression coefficients in high-dimensional data analysis have been well studied. However, little attention has been paid to sparse features. Such features produce design matrices in which many columns are highly sparse, so traditional statistical methods are no longer suitable. To deal with the challenges posed by sparse features, we propose a feature aggregation method based on composite quantile regression. A nonconvex pairwise fusion penalty is used to automatically detect and identify homogeneous subgroups of predictors, and predictors in the same subgroup are combined into a relatively dense latent factor. To implement the method, we propose an efficient algorithm based on the alternating direction method of multipliers (ADMM) framework and establish the oracle property of the proposed estimators under some regularity conditions. Both simulation results and real data analysis show the effectiveness of the proposed method.
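The two ideas summarized above can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the abstract does not name the nonconvex penalty, so the minimax concave penalty (MCP) is assumed here purely as an example, and the function names, the toy data, and the tuning values `lam` and `gamma` are all hypothetical. The first part evaluates a median regression objective whose fusion penalty runs only over network edges; the second part merges predictors that share a subgroup label into one denser column.

```python
import numpy as np

def check_loss(residuals, tau=0.5):
    """Quantile check loss rho_tau(r) = r * (tau - 1{r < 0}); tau = 0.5 is median regression."""
    r = np.asarray(residuals, dtype=float)
    return r * (tau - (r < 0))

def mcp(t, lam, gamma=3.0):
    """Minimax concave penalty (assumed here as one example of a nonconvex penalty)."""
    t = np.abs(np.asarray(t, dtype=float))
    return np.where(t <= gamma * lam,
                    lam * t - t**2 / (2 * gamma),
                    gamma * lam**2 / 2)

def network_fused_objective(y, X, alpha, beta, edges, lam, tau=0.5):
    """Mean check loss plus a penalty on intercept differences across network edges."""
    residuals = y - alpha - X @ beta
    i, j = np.asarray(edges).T  # endpoints of each edge
    return check_loss(residuals, tau).mean() + mcp(alpha[i] - alpha[j], lam).sum()

def aggregate_features(X, labels):
    """Sum the columns of X that share a subgroup label into one latent factor each."""
    labels = np.asarray(labels)
    return np.column_stack([X[:, labels == g].sum(axis=1) for g in np.unique(labels)])

# Toy data (hypothetical): two latent subgroups with intercepts 0 and 5.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
beta = np.array([2.0])
alpha_true = np.array([0.0, 0.0, 5.0, 5.0])
y = alpha_true + X @ beta
edges = [(0, 1), (2, 3)]  # edges connect subjects within the same subgroup

grouped = network_fused_objective(y, X, alpha_true, beta, edges, lam=1.0)
pooled = network_fused_objective(y, X, np.full(4, 2.5), beta, edges, lam=1.0)

# Sparse indicator columns: the first two share a subgroup and are merged.
X_sparse = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
Z = aggregate_features(X_sparse, labels=[0, 0, 1])
```

On this toy data the subgroup-specific intercepts fit the responses exactly and incur no penalty, whereas a single pooled intercept leaves nonzero residuals. Restricting the penalty to edges means the number of penalized differences grows with the number of edges rather than with all pairs of subjects.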

Keywords/Search Tags:

Homogeneity, Heterogeneity, Group Structure, Quantile Regression, Oracle Property, Alternating Direction Method of Multipliers

Related items

1	Priori-based Image Restoration Via Weighted Schatten-p Norm Minimization
2	Partial Linear Single-index Model For Quantile Regression Based On Integration Analysis
3	A Promotion Of Education Quality Benefits "Whom"
4	An Experimental Study On Football Teaching In Senior High School Of Three Groups Of Homogeneity, Heterogeneity And Homogeneity Heterogeneity
5	Localized Low-Rank Promoting For Image Denoising Based On Robust Principal Component Analysis Method
6	Spatial-temporal Models Based On Quantile Regression
7	Study On Human Capital Effects Based On Bayesian Quantile Regression Models
8	Renewable Quantile Regression Estimation Methods For Large Scale Streaming Datasets
9	Research On The Relationship Between Financial Decentralization And Economic Growth Based On Panel Quantile Regression
10	Model Specification Tests Of Quantile Regression Models