| Objective: To analyze the patterns of human brucellosis in Liaoning Province from 2008 to 2017 and the importance of meteorological features, and to explore the application of the seasonal ARIMA model and the XGBoost machine learning model to short-term prediction of brucellosis. The predictions of the two models were compared using MAE, RMSE and MAPE, and the model best suited to predicting the disease in Liaoning Province was selected. The study offers new ideas for the accurate prediction of infectious diseases and a scientific basis for formulating prevention and early-warning strategies for infectious diseases in Liaoning Province. Methods: Lagged incidence data and lagged meteorological data for human brucellosis in Liaoning Province from January 2008 to December 2016 were used as the training set, and incidence data from January 2017 to December 2017 were used as the test set. An ARIMA model was fitted to the brucellosis incidence data in the training set, and its predictive performance was evaluated on the test set. An XGBoost machine learning model (excluding meteorological factors) was built from the lagged incidence data in the training set and evaluated on the test set. Feature variables were then selected from the lagged incidence data and lagged meteorological data in the training set by random forest cross-validation; an XGBoost machine learning model (including meteorological factors) was built from the selected features and evaluated on the test set. Results: 1. From 2008 to 2017, the incidence of human brucellosis in Liaoning Province rose in spring, peaking in May, and declined in autumn, with the lowest
incidence in December. 2. Random forest cross-validation on the training set selected 10 numerical variables. In descending order of importance, they were: number of cases lagged 12 months, number of cases lagged 1 month, number of cases lagged 11 months, number of cases lagged 2 months, number of cases lagged 10 months, humidity lagged 7 months, temperature lagged 10 months, air pressure lagged 4 months, air pressure lagged 1 month, and humidity lagged 2 months. 3. For the seasonal ARIMA(0,1,2)×(0,1,1)[12] model, the MAE, RMSE and MAPE were 18.842, 25.975 and 16.749% on the training set and 49.653, 58.970 and 29.122% on the test set. For the XGBoost model (excluding meteorological factors), the MAE, RMSE and MAPE were 12.248, 19.013 and 11.622% on the training set and 39.687, 44.449 and 26.303% on the test set. For the XGBoost model (including meteorological factors), the MAE, RMSE and MAPE were 11.777, 18.560 and 10.276% on the training set and 28.955, 37.864 and 17.973% on the test set. Conclusion: 1. The incidence of human brucellosis in Liaoning Province from 2008 to 2017 showed obvious seasonality. 2. For the prediction of human brucellosis incidence in Liaoning Province from 2008 to 2017, the XGBoost model (excluding meteorological factors) had higher prediction accuracy than the multiplicative seasonal ARIMA(0,1,2)×(0,1,1)[12] model. 3. For the same prediction task, the XGBoost model (including meteorological factors) had higher prediction accuracy than both the XGBoost model (excluding meteorological factors) and the multiplicative seasonal ARIMA(0,1,2)×(0,1,1)[12] model. 4. For the same prediction task, adding meteorological data markedly improved the prediction accuracy of the XGBoost model (test-set MAPE of 17.973% versus 26.303%). |
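
The three error measures used to compare the models can be illustrated with a minimal sketch. It assumes the standard definitions of MAE, RMSE and MAPE; the function names and the sample values are hypothetical and are not taken from the study.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, in the same units as the series."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses more than MAE does."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * ratios / len(actual)


# Hypothetical monthly case counts, for illustration only (not study data).
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

print("MAE :", mae(observed, forecast))
print("RMSE:", rmse(observed, forecast))
print("MAPE:", mape(observed, forecast), "%")
```

MAE and RMSE are expressed in the units of the predicted series, whereas MAPE is a percentage, so only MAPE values are directly comparable across series of different scale.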