Effective Statistical Analysis Of Complex Functional Data | Posted on: 2022-11-10 | Degree: Doctor | Type: Dissertation | Country: China | Candidate: H Liu | GTID: 1520306347951679 | Subject: Mathematical Statistics | Abstract/Summary: |
When a variable is measured or observed at multiple times, it can be treated as a function of time. Such a variable is called a functional variable, and its data are called functional data. Functional data are often functions of time, but may also be functions of spatial location, wavelength, etc. In recent years, functional data analysis has received considerable attention in many applied fields, including clinical, biometrical, epidemiological, social, and economic research. Functional regression, which models the relationship among functional and scalar variables, is widely used in functional data analysis. The existing literature on functional regression can be divided into three categories, depending on whether the responses and covariates are functional or scalar: (i) functional responses with functional covariates; (ii) scalar responses with functional covariates; and (iii) functional responses with scalar covariates. As real data grow more complex, functional data exhibit new characteristics, including large scale, dynamics, and interaction effects, which call for new models and methods. In this dissertation, we study several problems based on different functional regression models.

Firstly, motivated by recent work on massive functional data, such as the COVID-19 data, we propose a new dynamic interaction semiparametric function-on-scalar (DISeF) model. The proposed model is useful for exploring the dynamic interaction among a set of covariates and their effects on the functional response, and it includes many important recently investigated models as special cases. Using a tensor product B-spline approximation of the unknown bivariate
coefficient functions, we develop an efficient three-step procedure that iteratively estimates the bivariate varying-coefficient functions, the vector of index parameters, and the covariance functions of the random effects. We also establish the asymptotic properties of the estimators, including their convergence rates and asymptotic distributions. In addition, we develop a test statistic to check whether the dynamic interaction varies with time or spatial location, and prove its asymptotic normality. The finite sample performance of the proposed method and the test statistic is investigated in three simulation studies. The proposed DISeF model is also applied to the COVID-19 data and the ADNI data. In both applications, hypothesis testing shows that the bivariate varying-coefficient functions vary significantly with the index and with time or spatial location. For instance, we find that the interaction effect of population ageing and socio-economic covariates, such as the number of hospital beds, physicians, and nurses per 1,000 people and GDP per capita, on the COVID-19 death rate varies across periods of the pandemic. We also obtain, for 141 countries, a healthcare infrastructure index related to the COVID-19 mortality rate, estimated from the proposed DISeF model.

Secondly, the volume of data increases exponentially with the development of science and technology, which gives researchers more information to analyze. At the same time, despite the rapid development of computational resources, the extraordinary amount of data also poses challenges. One challenge is that fitting a model to massive data can require more memory than a single computer provides. Moreover, the computing time can be too long for the results to be obtained in practice. An effective way to tackle these challenges is to take random subsamples from the massive data as a surrogate. Motivated by the memory and computation challenges
in massive data under the scalar-on-function linear model, we propose an optimal subsampling method for the functional linear model based on L-optimality, which minimizes the asymptotic integrated mean squared error (IMSE) of the subsampling estimator in approximating the full-data estimator. The algorithm is computationally efficient and significantly reduces computing time compared with the full-data approach. We establish the asymptotic properties of the subsampling estimators. The finite sample performance of the proposed method, and its comparison with the uniform subsampling method, are investigated through simulation studies under three scenarios. We also use this algorithm to analyze the global climate data from three stages, with full data size n = 1,028,032. The analysis of this data set shows that the optimal subsampling method motivated by the L-optimality criterion is better than the uniform subsampling method and approximates the full-data results well.

Thirdly, for the scalar-on-function generalized linear model, we also propose an optimal subsampling method based on the L-optimality criterion to tackle the same computing time and memory challenges. We establish the asymptotic properties of the estimators obtained by the subsampling method. The finite sample performance of the proposed subsampling method is investigated in two simulation studies, under functional logistic regression and functional Poisson regression, respectively. We then use the kidney transplant data to illustrate the proposed subsampling method for the functional generalized linear model. The simulation results and the empirical application to the kidney transplant data show that our subsampling method outperforms the uniform subsampling method and approximates the full-data results well. R code and an R package have been developed to implement the proposed methods.
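The two-step subsampling idea described in the abstract (a pilot estimate, then non-uniform sampling followed by a weighted fit) can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the sampling probabilities are taken proportional to |pilot residual| × ‖z_i‖, a form associated with L-optimality for linear models in the optimal subsampling literature; the coefficient function is expanded in a polynomial basis {1, t} as a stand-in for whatever basis the dissertation uses; and no roughness penalty is applied. All function names (`simulate`, `wls`, `l_opt_subsample`) and the simulated data are hypothetical.

```python
import math
import random

TRUE_COEF = [1.0, 2.0]  # assumed beta(t) = 1 + 2t in the basis {1, t}

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x

def wls(Z, y, w):
    """Weighted least squares: minimise sum_i w_i * (y_i - z_i' b)^2."""
    p = len(Z[0])
    A = [[sum(wi * z[j] * z[k] for z, wi in zip(Z, w)) for k in range(p)]
         for j in range(p)]
    b = [sum(wi * z[j] * yi for z, yi, wi in zip(Z, y, w)) for j in range(p)]
    return solve(A, b)

def simulate(n, rng, grid_size=50, noise_sd=0.1):
    """Simulate curves X_i(t) on a midpoint grid and reduce each curve to
    basis scores z_ik = integral of X_i(t) * B_k(t) dt (Riemann sum)."""
    dt = 1.0 / grid_size
    grid = [(j + 0.5) * dt for j in range(grid_size)]
    Z, y = [], []
    for _ in range(n):
        a, b = rng.gauss(0, 1), rng.gauss(0, 1)
        x = [a + b * math.sin(2 * math.pi * t) for t in grid]
        z = [sum(xj * dt for xj in x),
             sum(xj * t * dt for xj, t in zip(x, grid))]
        Z.append(z)
        # y_i = integral of X_i(t) * beta(t) dt + noise, via the same sums
        y.append(sum(c * zk for c, zk in zip(TRUE_COEF, z))
                 + rng.gauss(0, noise_sd))
    return Z, y

def l_opt_subsample(Z, y, r0, r, rng):
    n = len(y)
    # Step 1: pilot estimate from a uniform subsample of size r0.
    idx0 = [rng.randrange(n) for _ in range(r0)]
    pilot = wls([Z[i] for i in idx0], [y[i] for i in idx0], [1.0] * r0)
    # Step 2: probabilities proportional to |pilot residual| * ||z_i||.
    score = [abs(y[i] - sum(c * zk for c, zk in zip(pilot, Z[i])))
             * math.sqrt(sum(zk * zk for zk in Z[i])) for i in range(n)]
    total = sum(score)
    prob = [s / total for s in score]
    idx = rng.choices(range(n), weights=prob, k=r)  # with replacement
    # Step 3: inverse-probability weighted fit on the subsample.
    return wls([Z[i] for i in idx], [y[i] for i in idx],
               [1.0 / prob[i] for i in idx])

rng = random.Random(0)
Z, y = simulate(5000, rng)
estimate = l_opt_subsample(Z, y, r0=200, r=500, rng=rng)
print(estimate)  # basis coefficients of the estimated beta(t)
```

Only 700 of the 5,000 simulated observations enter a fit here; the inverse-probability weights in the final step correct for the non-uniform sampling so that the subsample estimator targets the same quantity as the full-data fit.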
| Keywords/Search Tags: | Dynamic Effect, Functional Data Analysis, Functional linear regression, Functional generalized linear model, Hypothesis Testing, L-optimality, Massive data, Profile Least Squares, Subsampling, Tensor Product B-spline |