
Infectious Diseases Prediction Model Based On Support Vector Regression

Posted on: 2016-06-08
Degree: Master
Type: Thesis
Country: China
Candidate: L Yu
GTID: 2308330470957823
Subject: Computer application technology
Abstract/Summary:
Outbreaks of infectious diseases cause heavy losses and remain a major threat to human society. Forecasting the trend of an epidemic lets people prepare early and reduce those losses, so accurate prediction is of great practical significance.

Support Vector Machine (SVM) theory, developed as machine learning research has deepened, provides a new method for prediction. Infectious disease data are typically small-sample, irregular and nonlinear, which is the kind of data SVM handles well, so this thesis introduces Support Vector Regression (SVR) into epidemic prediction. Because infectious disease incidence is also a periodic time series, the Autoregressive Integrated Moving Average (ARIMA) model is used as well; combined with SVR, it yields good prediction results.

Firstly, the kernel function is central to SVM theory: it maps a nonlinear problem in a low-dimensional space to a linear problem in a high-dimensional space while avoiding the curse of dimensionality. To obtain better predictions, we choose a mixed kernel function, a linear combination of a global kernel and a local kernel, which offers both better learning ability and better generalization ability. We analyze the geometric meaning of SVR and put forward a method, based on feature distance, for calculating the combination coefficient. Drawing on the relationship between SVR and Support Vector Classification (SVC), the method transforms the regression problem into a classification problem. Following the principle that the larger the distance between samples of different classes, the better the classification, it first simplifies the optimization objective to a quadratic function and then solves for the combination coefficient. Experiments show that obtaining the combination coefficient directly from a formula in this way is more effective than traditional cross-validation or the particle swarm optimization (PSO) algorithm.

In addition, since the ARIMA model is good at capturing the temporal structure and periodicity of epidemic outbreaks, this thesis puts forward a new forecasting model that combines ARIMA and SVR. The new model takes into account both the influence of meteorological factors on infectious diseases and the disease's own periodicity, which further improves prediction accuracy and robustness.

Finally, the incidence of tuberculosis is chosen as the forecast target. Drawing on Traditional Chinese Medicine, we put forward a method to quantify "yun" and "qi" and add them to the input features. We then apply principal component analysis to reduce the dimension of the meteorological data. The SVR model, the ARIMA model and the combined ARIMA-SVR model are each used to forecast the incidence. Experimental results show that, for the SVR model, calculating the combination coefficient from feature distance is the most effective approach. The relative prediction errors of the SVR and ARIMA models are about 10% and 15% respectively, whereas the combined model keeps the relative error within about 5%, which demonstrates the effectiveness of the combined model.
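The mixed kernel described in the abstract can be illustrated with a minimal sketch. The abstract does not name the two component kernels or give the formula for the combination coefficient, so everything specific here is an assumption: a polynomial kernel stands in for the global kernel, a Gaussian RBF kernel for the local kernel, and the coefficient `lam` is passed in by hand and restricted to [0, 1] rather than computed by the thesis's feature-distance method. All function and parameter names are hypothetical.

```python
import math


def poly_kernel(x, z, degree=2, coef0=1.0):
    """Assumed global kernel: polynomial kernel (x . z + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + coef0) ** degree


def rbf_kernel(x, z, gamma=0.5):
    """Assumed local kernel: Gaussian RBF exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def mixed_kernel(x, z, lam=0.5, degree=2, coef0=1.0, gamma=0.5):
    """Combination lam * global + (1 - lam) * local.

    lam is the combination coefficient; here it is supplied by the caller,
    not derived from feature distance as in the thesis.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (lam * poly_kernel(x, z, degree, coef0)
            + (1.0 - lam) * rbf_kernel(x, z, gamma))


def gram_matrix(samples, lam=0.5):
    """Kernel (Gram) matrix of the mixed kernel over a list of samples."""
    return [[mixed_kernel(a, b, lam) for b in samples] for a in samples]
```

Restricting `lam` to [0, 1] keeps the result a valid kernel, because a non-negative weighted sum of positive semi-definite kernels is itself positive semi-definite. A Gram matrix built this way could then be passed to any SVR solver that accepts a precomputed kernel.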
Keywords/Search Tags: SVM, ARIMA, Mixed Kernel Function, Infectious Disease, Prediction