Research on Multiple Imputation and Multilevel Models Based on Longitudinal Missing Data

Posted on: 2024-03-30; Degree: Master; Type: Thesis
Country: China; Candidate: X Y Li
GTID: 2530307136452424; Subject: Applied Statistics
Abstract/Summary:
With the rapid development of the Internet and big data technology, data have been integrated into every field of society and have become a crucial factor of production. Data come in many types. Longitudinal data are a special form: a collection of observations measured repeatedly on the same individuals at different times or locations. They are widely found in psychology, sociology and economics. Longitudinal studies require long-term follow-up of the same subjects, which inevitably leads to missing values. To address this problem, this paper focuses on handling longitudinal missing data with multiple imputation and the multilevel model.

First, a simulation study was conducted. A complete longitudinal data set was generated in R, and the intraclass correlation coefficient was calculated to confirm that the data were suitable for the multilevel model and the generalized estimating equation (GEE). After comparing the results of the two models, the multilevel model fitted to the complete data was chosen as the control. Then, R was used to construct missing at random (MAR) and missing completely at random (MCAR) versions of the data set, and each was analyzed with the multilevel model and the GEE in turn. Finally, the Markov chain Monte Carlo (MCMC) method of multiple imputation was used to fill in the two incomplete data sets, and the resulting complete data sets were analyzed again with the multilevel model and the GEE. The parameter estimates and standard errors from these models were compared using two evaluation indexes, mean relative deviation and root mean square error. The combination of multiple imputation with the multilevel model performed best in handling longitudinal missing data.

To verify the practicality of this combined model, the simulation procedure was applied to two real data sets: the childhood lead poisoning trial (the TLC data) and the Fradingen study of chronic obstructive pulmonary disease. First, the TLC data were analyzed with the multilevel model to obtain parameter estimates and standard errors for all variables, and this result served as the control. Then, MCAR and MAR data sets were constructed from the TLC data using R 4.0.2 and Python 3.7, respectively, and analyzed with the multilevel model and the GEE in turn. Finally, the MCMC method of multiple imputation was used to fill in both incomplete data sets, and the two completed data sets were analyzed again with the multilevel model and the GEE. The same procedure was followed for the Fradingen study. The results from both examples were consistent with the simulation study: the combination of multiple imputation with the multilevel model performed better than the other approaches compared in handling longitudinal missing data. These findings can be applied in many research fields that involve longitudinal missing data, where they can improve the accuracy of the resulting estimates.
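The abstract does not give the simulation settings, so the following is only a minimal sketch of the steps it describes: generating longitudinal data, deleting values completely at random (MCAR) or at random given observed values (MAR), and scoring estimates against a known truth. All function names, the random-intercept data model, the missingness rates and the absolute-value form of the mean relative deviation are assumptions made for illustration, not the thesis's own code or definitions.

```python
import math
import random


def simulate_longitudinal(n_subjects=200, n_waves=4, seed=0):
    """Assumed model: y_ij = 10 + 0.5 * t_j + u_i + e_ij (random intercept)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_subjects):
        u = rng.gauss(0.0, 1.0)  # subject-level random intercept
        for t in range(n_waves):
            y = 10.0 + 0.5 * t + u + rng.gauss(0.0, 1.0)
            rows.append({"id": i, "time": t, "y": y})
    return rows


def make_mcar(rows, rate=0.2, seed=1):
    """MCAR: every follow-up value is dropped with the same probability."""
    rng = random.Random(seed)
    out = []
    for r in rows:
        drop = r["time"] > 0 and rng.random() < rate
        out.append({**r, "y": None if drop else r["y"]})
    return out


def make_mar(rows, high=0.35, low=0.05, seed=2):
    """MAR: drop probability depends only on the observed baseline value."""
    rng = random.Random(seed)
    baseline = {r["id"]: r["y"] for r in rows if r["time"] == 0}
    cutoff = sorted(baseline.values())[len(baseline) // 2]  # approximate median
    out = []
    for r in rows:
        if r["time"] == 0:
            p = 0.0  # the baseline is always kept, so it stays observed
        else:
            p = high if baseline[r["id"]] > cutoff else low
        out.append({**r, "y": None if rng.random() < p else r["y"]})
    return out


def mean_relative_deviation(estimates, truth):
    """Assumed definition: mean of |estimate - truth| / |truth|."""
    return sum(abs(e - truth) / abs(truth) for e in estimates) / len(estimates)


def rmse(estimates, truth):
    """Root mean square error of repeated estimates around the true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))
```

In a workflow like the one described, the incomplete data sets returned by `make_mcar` and `make_mar` would then be passed to a multiple imputation routine and to the multilevel and GEE models, which are not shown here. The estimates from each route would be scored with the two metric functions.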
Keywords/Search Tags: Longitudinal data, Multiple imputation, Multilevel model, Generalized estimating equation, Mean relative deviation