| In practice, most collected data, especially in economics and biomedicine, do not strictly follow a normal distribution. Such data, which often have very heavy tails or severe outliers, can instead be described by the skew-t-normal distribution, so the statistical analysis of skew-t-normal data is of both theoretical and practical significance. Moreover, during data collection, much sample-survey and experimental data is disturbed by nonresponse or lost for other reasons. It is therefore necessary to study the analysis of skew-t-normal data with missing values. For data with heavy-tailed distributions or outliers, both the ordinary least squares (OLS) method and the maximum likelihood (ML) method are sensitive to outliers, so their parameter estimates perform poorly. The least absolute deviation (LAD) method, by contrast, is resistant to outliers and more robust than the OLS and ML methods. In addition, the statistical results of the LAD method do not depend on the error distribution. Based on these findings, this paper carries out the following work and obtains the following results. First, for skew-t-normal data with missing values, the least absolute deviation method is used to estimate the regression coefficients of the linear regression model. This improves the robustness of parameter estimation, brings the sample distribution closer to the true distribution, makes the coefficient estimates more accurate, and reduces the adverse effect of outliers. The missing values are handled by regression imputation to further improve the effectiveness of parameter estimation. 
Compared with maximum likelihood estimation in stochastic simulations, the proposed method is shown to be feasible and effective. Second, for variable selection, to further verify the robustness of the LAD method, we apply the LAD-lasso method of Wang et al. to a real-data analysis of the Shanghai Medicine and Health index, which follows the skew-t-normal distribution. Compared with the OLS-lasso method, the LAD-lasso method performs better in variable selection for skew-t-normal data. |
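
The contrast between LAD and OLS under outliers, combined with regression imputation, can be illustrated with a minimal sketch. It is a toy example and not the simulation design of this paper: the data are hypothetical (an exact line with one outlier and one missing response, not skew-t-normal draws), the helper names `ols_fit`, `lad_fit`, and `regression_impute` are invented for illustration, and the brute-force LAD search is only practical for a single predictor and small samples.

```python
from itertools import combinations


def ols_fit(x, y):
    """Closed-form OLS for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def lad_fit(x, y):
    """Exact LAD line for one predictor; returns (b0, b1).

    Some minimizer of the sum of absolute residuals passes through
    two data points, so it suffices to search all pairs of points.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # vertical line, not a valid regression fit
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum(abs(yi - intercept - slope * xi) for xi, yi in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


def regression_impute(x, y, fit):
    """Fill missing responses (None) with fitted values from complete cases."""
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    b0, b1 = fit([p[0] for p in observed], [p[1] for p in observed])
    return [b0 + b1 * xi if yi is None else yi for xi, yi in zip(x, y)]


# Hypothetical data: y = 1 + 2x, with one outlier and one missing response.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi for xi in x]
y[9] = 100.0  # severe outlier
y[4] = None   # missing response

y_lad = regression_impute(x, y, lad_fit)
y_ols = regression_impute(x, y, ols_fit)
lad_b0, lad_b1 = lad_fit(x, y_lad)
ols_b0, ols_b1 = ols_fit(x, y_ols)

print("LAD:", lad_b0, lad_b1)
print("OLS:", ols_b0, ols_b1)
```

In this toy data the LAD fit recovers the line y = 1 + 2x and imputes the missing response on that line, while the single outlier pulls the OLS slope well above 2 and distorts the imputed value accordingly.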