
Research On Influenza Prediction Application Based On Twitter Social Network Data

Posted on: 2020-02-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Zheng
GTID: 2404330599959810
Subject: Engineering
Abstract/Summary:
With global climate change, the number of influenza cases has increased significantly, and the death rate of severe influenza is close to 50%. Predicting influenza outbreaks is therefore increasingly important. In existing studies, social media has been used for earthquake disaster detection and trend prediction. This study combines machine learning algorithms with big data methods to predict influenza outbreaks. A text classification model is used to preprocess Twitter social network data, and a time series prediction model is used to predict the trend of the influenza epidemic in Queensland, Australia. The main research contents of this thesis are as follows:

1) The XGBoost algorithm is applied to text classification of flu data, addressing the poor performance of traditional machine learning algorithms on text classification. To verify the method, Twitter data was taken as the research object. First, suspected influenza tweets were extracted by keyword filtering; these were then classified with the XGBoost algorithm, and the results were compared with those of the traditional naive Bayes algorithm. The experimental results show that XGBoost achieves 90% classification accuracy on the Twitter data, significantly better than naive Bayes, and that it can be used to extract the frequency and time of influenza incidence from Twitter data.

2) A multivariate time series prediction model based on Twitter data and temperature data is proposed, addressing the poor timeliness and high relative error of the univariate time series prediction model. First, a univariate ARIMA time series model was built on historical influenza data from week 1 of 2015 to week 32 of 2017 to predict the number of influenza cases in the next four weeks. Then, using Twitter data and temperature data as input variables, a multivariate ARIMAX time series model was built to predict the number of influenza cases in the next four weeks. The results show that the average relative error of the ARIMAX model is 3.30%, lower than that of the ARIMA model; the ARIMAX model predicts better and can be used to predict the number of influenza cases in real time.

3) To address the limited, single-view visualization interfaces of influenza surveillance, a comprehensive visualization system for influenza surveillance and prediction was built. First, an influenza heat map is displayed by calling the Baidu Map API; then the influenza distribution map, influenza history curve, and future trend map are displayed using D3 visualization combined with geographical location data. The system presents the geographical distribution and trend of influenza outbreaks through a network structure of temporal and spatial relationships, and can directly guide the analysis and prevention of influenza outbreaks.
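
Two of the simpler steps described in the abstract, keyword filtering of tweets and the average relative error used to compare forecasts, can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the keyword list, the function names, the sample tweets, and the case counts are all assumptions made for the example, and the abstract does not state the exact error formula, so the common definition (mean of |predicted - actual| / actual) is assumed.

```python
import re

# Hypothetical keyword list; the abstract does not give the terms actually used.
FLU_KEYWORDS = ("flu", "influenza", "fever", "cough")

# Match whole words only, case-insensitively, so "influential" is not a hit.
_FLU_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in FLU_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def filter_suspected_flu(tweets):
    """Return the tweets that mention at least one flu-related keyword."""
    return [text for text in tweets if _FLU_PATTERN.search(text)]


def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / actual, expressed as a percentage.

    Assumes every actual value is nonzero (weekly case counts > 0).
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    errors = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Illustrative inputs only; these are not data from the thesis.
tweets = [
    "Down with the flu again, fever all night",
    "Great surf at the Gold Coast today",
    "Influenza shots available at the clinic",
    "This song is so influential",
]
suspected = filter_suspected_flu(tweets)

# Made-up weekly case counts for a four-week forecast horizon.
actual = [100, 200, 400, 500]
predicted = [110, 190, 400, 450]
mre = mean_relative_error(actual, predicted)
```

In a pipeline like the one described, the filtered tweets would then go to the classifier, and the same error function would be applied to both the ARIMA and ARIMAX forecasts so the two models are compared on equal terms.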
Keywords/Search Tags: Twitter social network, text classification, time series model, influenza prediction, visualization