
Distributed Statistical Inference Of Expectile Regression For Massive Data

Posted on: 2024-04-02   Degree: Master   Type: Thesis
Country: China   Candidate: S W Chen   Full Text: PDF
GTID: 2530307067991479   Subject: Statistics
Abstract/Summary:
As science and technology advance, the volume of data available for analysis keeps growing, giving rise to the notion of "massive data". In this setting, traditional statistical analysis and computational methods face serious challenges. Existing algorithms for massive data fall mainly into three classes: subsampling algorithms, online updating algorithms and distributed algorithms. Subsampling may waste information, and online updating is suited to streaming data. This thesis considers distributed algorithms.

As data volume increases, data also become more complex: nonlinearity and heterogeneity become more pronounced, and tools such as the mean regression model struggle to meet the needs of analysis. We therefore consider the expectile regression model. Compared with quantile regression, the loss function of expectile regression is differentiable, and estimating the asymptotic covariance does not require the error density function, which gives it some computational advantages. In addition, the expectile model is more sensitive to tail information, since it reflects not only the probability of the tail but also the values in the tail. These properties have led to its wide use in many fields. For example, in financial risk measurement, the expectile-based EVaR better captures risks that have a low probability of occurrence but involve huge losses.

This thesis studies distributed parameter estimation and statistical inference for the linear expectile regression model. Building on the traditional asymmetric least squares (ALS) estimator, and combining estimating equations with meta-analysis methods, we propose three algorithms: a distributed algorithm based on the generalized method of moments (GMM), a fast distributed algorithm based on a weighting equation, and a distributed algorithm based on the Meta confidence distribution. We also establish the asymptotic normality of the estimators under a heterogeneous distributed system, which indicates that the algorithm converges to the true parameter value at rate O((?)).

Simulation experiments show that the DC estimator based on the weighting equation is stable across multiple data-generating models and is comparable to the traditional ALS estimator. For heterogeneous data, the CD estimator based on the Meta confidence distribution performs best, far better than the traditional ALS estimator. Finally, birth rate data are used to explore the effects of demographic characteristics and maternal behavior on neonatal weight, providing reasonable and effective advice for preventing low birth weight in newborns.
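To make the ALS (asymmetric least squares) estimator mentioned in the abstract concrete, the following is a minimal sketch in Python with NumPy. It assumes the standard expectile loss, sum of |tau - 1{r < 0}| * r^2 over residuals r, and fits it by iteratively reweighted least squares. The second function is only a naive baseline that fits each data block separately and averages the results by block size; it is not the GMM, weighting-equation, or confidence-distribution algorithm proposed in the thesis. The function names `expectile_fit` and `averaged_expectile_fit`, the synthetic data, and all parameter values are illustrative assumptions.

```python
import numpy as np


def expectile_fit(X, y, tau=0.5, tol=1e-10, max_iter=200):
    """ALS fit by iteratively reweighted least squares (illustrative sketch).

    Minimizes sum_i |tau - 1{r_i < 0}| * r_i**2, with r_i = y_i - x_i' beta.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS start (the tau = 0.5 case)
    for _ in range(max_iter):
        resid = y - X @ beta
        # Weight tau for non-negative residuals, 1 - tau for negative ones.
        w = np.where(resid < 0, 1.0 - tau, tau)
        XtW = X.T * w  # X' W, shape (p, n)
        new_beta = np.linalg.solve(XtW @ X, XtW @ y)
        if np.max(np.abs(new_beta - beta)) < tol:
            return new_beta
        beta = new_beta
    return beta


def averaged_expectile_fit(X, y, n_blocks, tau=0.5):
    """Naive baseline (an assumption, not the thesis's method):
    fit each block on its own, then average weighted by block size."""
    blocks = np.array_split(np.arange(len(y)), n_blocks)
    fits = np.array([expectile_fit(X[idx], y[idx], tau) for idx in blocks])
    sizes = np.array([len(idx) for idx in blocks], dtype=float)
    return (sizes[:, None] * fits).sum(axis=0) / sizes.sum()


# Synthetic homoscedastic linear model, used only for illustration.
rng = np.random.default_rng(0)
n = 20000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

beta_full = expectile_fit(X, y, tau=0.8)
beta_avg = averaged_expectile_fit(X, y, n_blocks=10, tau=0.8)
```

Because the loss is differentiable, each iteration only requires solving a weighted least squares problem, which is the computational advantage over quantile regression that the abstract refers to. With homogeneous blocks, the averaged fit is expected to stay close to the full-data fit; with heterogeneous blocks a simple average can be a poor choice, which is the setting the thesis's estimators are designed for.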
Keywords/Search Tags:Distributed Algorithms, Expectile Regression, Meta-Analysis, Generalized Method of Moments, Heterogeneous Distributed System