Longitudinal data consist of repeated observations of multiple covariates on each individual at different times, with independence between groups and correlation within groups. Such data arise in a wide range of fields, including biology, sociology, epidemiology, and economics. In this paper, we focus on discovering latent categories of individuals with similar longitudinal trajectories and on identifying, within each category, the key variables that strongly influence the response variable. To this end, we assume that the response variable follows a Gaussian mixture distribution, which clusters individuals into latent categories; we use a linear mixed model to capture the between-group independence and within-group correlation of longitudinal data; and we apply the SCAD penalty and the group SCAD penalty to select fixed effects and random effects simultaneously. The three components are combined and fitted jointly, which reduces the loss caused by stepwise modeling, and estimation is carried out with the EM algorithm. Numerical simulations show that the method performs well in parameter estimation, variable selection, and longitudinal data clustering. Finally, the effectiveness of the model in practical application is demonstrated on data from the follow-up survey on health influencing factors of the elderly in China.
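For readers unfamiliar with the two penalties named in the abstract, the following is a minimal sketch of how they can be evaluated. It assumes the standard SCAD form of Fan and Li with the conventional shape parameter a = 3.7, and a group version that applies SCAD to the Euclidean norm of a coefficient group. The function names `scad_penalty` and `group_scad_penalty` are hypothetical, and the abstract does not state the exact parametrization used in the paper.

```python
import math


def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty for one coefficient (assumed form, a > 2)."""
    t = abs(theta)
    if t <= lam:
        # Small coefficients: behaves like the lasso penalty.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no extra shrinkage.
    return lam * lam * (a + 1) / 2


def group_scad_penalty(group, lam, a=3.7):
    """Group SCAD (assumed form): SCAD applied to the group's L2 norm."""
    norm = math.sqrt(sum(g * g for g in group))
    return scad_penalty(norm, lam, a)
```

Because the penalty is flat beyond a·λ, large coefficients are not shrunk further, while a whole group of coefficients is kept or dropped together when the penalty acts on the group norm.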