
Research On Algorithm Of Mixed Effect Regression Model Under Massive Data

Posted on: 2020-03-26  Degree: Master  Type: Thesis
Country: China  Candidate: Q Geng  Full Text: PDF
GTID: 2370330602462008  Subject: Mathematics
Abstract/Summary:
The rapid development of society and of Internet cloud computing has brought us into the era of massive data, or big data. Massive data brings opportunities as well as a series of challenges, such as storage bottlenecks and computational inefficiency. To cope with these challenges, earlier scholars proposed the divide-and-conquer algorithm, which is now widely used in statistics. However, they applied it only to simple statistical models under massive data. Many more complex and widely used models, such as mixed-effects regression models, have not yet been studied in this setting. The mixed-effects regression model describes well the intra-group correlation and inter-group independence in longitudinal/panel data. It is a commonly used model for analyzing longitudinal data and has been widely studied in fields such as finance, economics, and medicine. This paper focuses on estimation algorithms for the linear mixed-effects model and the semiparametric mixed-effects model under massive data. Based on the divide-and-conquer algorithm, new algorithms are proposed to estimate the parameters of these two types of models.

Firstly, the linear mixed-effects model is studied under massive data. The classical maximum likelihood and restricted maximum likelihood methods for the linear mixed-effects model involve complicated steps when estimating the variance components, which imposes a heavy computational burden and leads to excessively long calculation times. In addition, when the model coefficients are estimated by weighted least squares, the computation may fail for lack of memory because the dimension of the weight matrix is too high. In view of this, and in order to avoid iterative calculation, a three-step estimation method is proposed that builds on the previous literature. Then, for two cases of massive data, the three-step estimation method is combined with the divide-and-conquer algorithm to give a three-stage estimation algorithm. Simulation studies show that the three-stage estimation algorithm solves the storage limitation problem and, compared with the maximum likelihood method, greatly reduces calculation time and improves efficiency. Finally, the proposed three-stage estimation algorithm is applied to an analysis of the factors influencing regional GDP, which demonstrates that the mixed model and its divide-and-conquer algorithm are feasible in real applications.

This paper further studies the estimation problem for the semiparametric mixed-effects model under massive data. The research is carried out in two steps. The first step studies the estimation problem for the semiparametric regression model under massive data. The local linear least squares method is combined with the divide-and-conquer algorithm to obtain a new algorithm, and it is proved that the resulting parameter estimates are asymptotically normal provided that the number of blocks is bounded. In the second step, the algorithm is extended to the semiparametric mixed-effects model under massive data. The common cross-section kernel method for the semiparametric mixed-effects model is combined with the divide-and-conquer algorithm, and a new algorithm for estimating the model parameters is proposed. Simulations show that the new algorithms proposed for these two types of models are feasible under massive data, and verify that they relieve the problem of insufficient memory and improve calculation speed.
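
As a rough illustration of the divide-and-conquer idea that the abstract builds on, the sketch below fits an ordinary least squares regression block by block. It is not the thesis's three-stage algorithm and does not handle random effects; it only shows the general pattern of computing small summary statistics on each block and combining them, so that the full data set never has to be held in memory at once. All function names (`block_stats`, `solve`, `dc_ols`, `make_blocks`) and the simulated data are assumptions made for this example.

```python
import random


def block_stats(X, y):
    """Return the summary statistics (X'X, X'y) for one block of data."""
    p = len(X[0])
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for row, yi in zip(X, y):
        for j in range(p):
            xty[j] += row[j] * yi
            for k in range(p):
                xtx[j][k] += row[j] * row[k]
    return xtx, xty


def solve(A, b):
    """Solve A beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]  # augmented matrix [A | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (M[i][n] - s) / M[i][i]
    return beta


def dc_ols(blocks):
    """Divide-and-conquer OLS: sum per-block (X'X, X'y), then solve once."""
    total_xtx, total_xty = None, None
    for X, y in blocks:  # each block could be read from disk in turn
        xtx, xty = block_stats(X, y)
        if total_xtx is None:
            total_xtx, total_xty = xtx, xty
        else:
            p = len(xty)
            for j in range(p):
                total_xty[j] += xty[j]
                for k in range(p):
                    total_xtx[j][k] += xtx[j][k]
    return solve(total_xtx, total_xty)


def make_blocks(n_blocks=5, block_size=200, beta=(1.0, 2.0, -0.5), seed=0):
    """Simulate blocks from y = X beta + noise (hypothetical test data)."""
    rng = random.Random(seed)
    blocks = []
    for _ in range(n_blocks):
        X = [[1.0, rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(block_size)]
        y = [sum(b * x for b, x in zip(beta, row)) + rng.gauss(0, 0.1)
             for row in X]
        blocks.append((X, y))
    return blocks


if __name__ == "__main__":
    print(dc_ols(make_blocks()))
```

For ordinary least squares the block-wise estimate agrees with the full-data estimate up to floating-point rounding, because X'X and X'y are plain sums over observations. For mixed-effects and semiparametric models the combination step is less direct, which is the problem the thesis addresses.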
Keywords/Search Tags:massive data, linear mixed effect model, divide-and-conquer algorithm, semiparametric mixed effect model