
A Study Of Influenza Epidemic Characteristics, Influencing Factors And Model Prediction In China

Posted on: 2021-09-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P L Zhong
GTID: 1484306038475354
Subject: Chinese medicine
Abstract/Summary:
Objective: Based on influenza-like illness (ILI) case data in China from 2005 to 2017, together with the corresponding meteorological and air quality data from 2015 to 2017, this study aimed to identify the epidemic characteristics and influencing factors of influenza; to establish prediction models at the national, northern, southern, provincial and city levels; to predict the number of cases in each of these regions; and to evaluate the forecasting effect and performance of the models.

Methods:
1. Contour analysis was used to study the epidemic characteristics of influenza in China. R and Python were used to visualize the data, extract statistical features, and examine the differences between the north and the south.
2. Pearson correlation analysis and GBDT feature importance analysis were used to identify the factors influencing influenza in China. First, the relationships among the ILI data, the meteorological data and the air quality data were observed directly by plotting them. Second, a correlation coefficient matrix of the ILI, meteorological and air quality data was used to extract important influencing factors and to exclude linearly correlated ones. Third, GBDT was used to rank the features by importance, and the common influencing factors with large weights were used as input variables for the subsequent prediction models.
3. Using time series (TS) and neural network (NN) methods, ARIMA and LSTM models were established for prediction. At the national, northern, southern and provincial levels, the prediction effect and performance of ARIMA and LSTM models based on historical influenza data were compared. At the city level, the prediction effect and performance of a single-factor LSTM model and four multi-factor LSTM models were compared. "Single-factor" means the model uses only historical influenza data, while "multi-factor" means influencing factors are superimposed on the historical influenza data. To verify that the influencing factors had been selected effectively, prediction performance was also compared pairwise among the LSTM models with different superimposed factors.

Results:
1. In China as a whole, influenza shows a winter peak, a spring sub-peak, and a trough in summer and autumn. In the north, the main features are a single peak in winter and a trough in summer. In the south, there are winter, spring and summer peaks and an autumn trough.
2. By comparing the features identified through direct observation, Pearson correlation coefficient matrix analysis and GBDT importance analysis, and with reference to related research, four groups of factors were selected: ① average temperature, average pressure and average relative humidity; ② average temperature; ③ average absolute humidity; ④ average temperature and average absolute humidity. These groups were added to construct the multi-factor prediction models.
3. At the national, southern, northern and provincial levels, the time series ARIMA model and the neural network LSTM model were each used for prediction based on historical influenza incidence data. The mean RMSE of LSTM was 435.53, significantly lower than ARIMA's 662.92. At the city level, the mean RMSE values on the test set were 81.81 for the single-factor LSTM model, 73.46 for the multi-factor LSTM-1 model (factor group ①), 84.42 for LSTM-2 (factor group ②), 74.90 for LSTM-3 (factor group ③), and 75.54 for LSTM-4 (factor group ④). Ranked by mean RMSE from smallest to largest, the models are LSTM-1, LSTM-3, LSTM-4, LSTM, LSTM-2. The Friedman test for multiple related samples gave P = 0.03104 (P < 0.05), indicating that the prediction performance of the five models was not all the same. According to the paired Wilcoxon test, the differences in prediction performance between LSTM-1 and LSTM-2 (P = 0.0205, P < 0.05) and between LSTM-3 and LSTM-4 (P = 0.0057, P < 0.05) were statistically significant. For all 7 models (2 at the provincial level and 5 at the city level), the mean RMSE values in the south and in the north were compared. In every case the mean RMSE in the north was smaller than in the south, but the Wilcoxon test of the two independent samples from the south and the north showed no statistically significant difference.

Conclusion:
1. Low temperature may be conducive to the onset and spread of influenza, while warm and dry or hot and dry weather may inhibit it. In warm or hot weather, higher humidity may favor influenza epidemics. It may not be appropriate to divide epidemic areas simply by north and south or by latitude, because influenza epidemics are affected by specific climatic or meteorological factors.
2. There is almost no linear correlation between the incidence of influenza and meteorological or air quality factors. Temperature is not the only or decisive factor affecting influenza epidemics, whereas absolute humidity may be.
3. In terms of the prediction model, the neural network LSTM model performs very well and is suitable for short-term and medium-to-long-term prediction of influenza. As for superimposing influencing factors, appropriate factors can improve the prediction performance of the model, while inappropriate ones may reduce it. In evaluating prediction performance, RMSE is affected both by the accuracy of the prediction and by the number of influenza cases. The prediction effect is mainly determined by the influenza incidence data and the choice of model, rather than by differences between north and south, differences among climatic zones, or the complexity of the epidemic characteristics. Regarding how the incidence data affect the prediction effect, an extreme peak in the incidence data seriously degrades the model's predictions. If the peak in the training set is an extreme value, it can easily cause the model's prediction, and even its fitting, to fail. If the peak in the test set is an extreme value, the prediction can easily fail in the region of that extreme value. Regarding the scope and significance of model prediction, the principles of the prediction models and the research practice covering 31 provinces and 32 cities in China over 13 years show that the models have difficulty handling extreme values properly. Model prediction may therefore apply only to seasonal influenza epidemics, not to an influenza pandemic.
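
The conclusion that RMSE depends on both prediction accuracy and the number of cases can be illustrated with a minimal Python sketch. It is not the dissertation's code. The weekly case counts and the helper name `rmse` are hypothetical and invented for illustration; only the five mean RMSE values are taken from the abstract.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical weekly ILI case counts (invented, not from the study).
small_region_actual = [100, 120, 150, 130]
# A forecast that overshoots by 10% every week.
small_region_pred = [a * 1.1 for a in small_region_actual]

# Same 10% relative error, but ten times the case volume.
large_region_actual = [a * 10 for a in small_region_actual]
large_region_pred = [a * 1.1 for a in large_region_actual]

rmse_small = rmse(small_region_actual, small_region_pred)
rmse_large = rmse(large_region_actual, large_region_pred)
print(f"small region RMSE: {rmse_small:.2f}")
print(f"large region RMSE: {rmse_large:.2f}")

# Mean test-set RMSE per city-level model, as reported in the abstract.
reported_mean_rmse = {
    "LSTM": 81.81,
    "LSTM-1": 73.46,
    "LSTM-2": 84.42,
    "LSTM-3": 74.90,
    "LSTM-4": 75.54,
}
# Rank models from smallest to largest mean RMSE.
ranking = sorted(reported_mean_rmse, key=reported_mean_rmse.get)
print(ranking)
```

Both hypothetical forecasts have the same relative error, yet the RMSE for the larger region is about ten times higher. This is why raw RMSE values from regions with very different case counts should be compared with caution.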
Keywords/Search Tags: influenza, epidemic characteristics, influencing factors, model, prediction, LSTM