Online Support Vector Machine With GA-Based-Parameter-Optimization And Its Application To Meteorology Forecasting

Posted on: 2008-09-17
Degree: Master
Type: Thesis
Country: China
Candidate: C L Wu
GTID: 2178360215452540
Subject: Computer application technology
Abstract/Summary:
Climate change strongly influences global and regional economic development, and meteorological disasters harm people's lives to varying degrees. Among them, prolonged high temperatures in summer seriously harm crop growth and human health and cause shortages of drinking water. The damage caused by high temperatures should therefore not be underestimated, and forecasting high temperatures is of great significance. However, the meteorological system is a high-order nonlinear system with many unstable factors, and its complicated internal interactions and uncontrolled variability make forecasting very difficult. With the rapid development of modern computer technology, statistical methods have improved greatly and are no longer limited to simple computation and analysis of observed data. Nevertheless, the results of most statistical methods are theoretically reliable only when the sample size is very large. In practice the number of samples is usually limited, and in many circumstances it is small, so traditional statistical methods cannot deliver their ideal results. In recent years, stepwise regression analysis on principal components, methods based on multiple regression models, and artificial neural networks have been applied to meteorological forecasting, providing many new approaches to the problem.

The support vector machine (SVM) has received increasing attention, from its original theoretical research to extended engineering applications, since it was proposed by Vapnik in 1995, because of its good generalization performance on many machine learning problems. Least squares support vector regression (LSSVR) was proposed by Suykens et al. (1999). The incremental and decremental SVM training algorithms were proposed by Cauwenberghs et al. in 2001.
The scale-fixed iterative least squares support vector regression (FILSSVR) algorithm was proposed by Wu et al. (2005). These improved training algorithms speed up training to varying degrees and help extend SVM to broader fields. However, selecting the predetermined SVM parameters, γ and σ, is time-consuming and difficult. The genetic algorithm (GA) simulates the process of biological evolution and can effectively solve combinatorial optimization problems.

To select the predetermined SVM parameters more efficiently, this dissertation proposes two SVM algorithms with GA-based parameter optimization and applies them to forecasting the daily maximum air temperature. The main contributions of this thesis are as follows:

(1) We first introduce the VC dimension, the bounds on the generalization ability of a learning machine, and the structural risk minimization principle of statistical learning theory, and then expound the basic theory of the support vector machine, LSSVR, the incremental LSSVR and decremental SVM training algorithms, and the AISVR algorithm.

(2) To forecast the time series of daily maximum temperature, the author proposes the GA-based parameter optimization support vector machine (GA-SVM) algorithm. Through GA's automatic search and evolution, GA-SVM can find satisfactory solutions and avoids the poor performance caused by manual selection of the predetermined parameters. The GA-SVM algorithm requires the following to be set in advance: the sample size, the work set size, the population size Psize, the maximum number of generations Gmax, the crossover probability Pc, the mutation probability Pm, and the acceptable relative error θ. In addition, the search ranges of γ and σ in the SVM must be fixed so that GA can search for the best solution. The GA-SVM algorithm starts with generation 0, randomly producing Psize individuals within the predetermined parameter ranges to construct the initial population.
After this, all the samples are used to train the SVM in batch mode: the first L samples make up the work set, and the others are used for testing. Before this step, the fitness of an individual must be defined; here it is the forecasting correct rate obtained with that individual. When these steps are finished, GA performs selection, crossover, and mutation, evolving continuously until the generation count reaches Gmax. At that point we obtain satisfactory parameters γ and σ. By testing a new group of samples with these parameters, we can judge whether GA-SVM performs well.

For the data experiments, this thesis puts forward an effective method for preprocessing the data. Experiments show that samples preprocessed by this method capture more of the data's characteristics, and that both the learning speed and the forecasting precision improve. The results of GA-SVM on this sample set are as follows: with an acceptable relative error of less than 0.05, the forecasting correct rate reached 76%, the mean square error was 1.479733, and the mean relative error was 0.033379.

(3) To improve the forecasting correct rate, the author modifies the GA-SVM algorithm described above by adding the incremental and decremental SVM training algorithms to it, yielding the GA-based parameter optimization online support vector machine (GA-OSVM) algorithm. In GA-OSVM the training window slides along the time series, so the GA-OSVM algorithm is capable of online learning. The regression parameters obtained by training on the initial work set are used only to forecast the first sample after the work set. After this step, we add a sample to the work set, delete the first sample of the work set, and then train on the new work set. With this approach, an individual's fitness can be obtained only after all the testing samples have been forecast.
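The sliding-window evaluation described in (3) can be illustrated with a minimal sketch. It assumes a univariate series and a hypothetical `predict_next` callback that retrains on the current work set and returns a one-step-ahead forecast. The thesis uses incremental and decremental SVM updates rather than retraining from scratch, and its actual input features are not given in this abstract, so the function names, the `window_mean` stand-in model, and the toy temperatures below are illustrative assumptions only.

```python
from typing import Callable, Sequence, Tuple


def relative_error(actual: float, predicted: float) -> float:
    """Absolute forecast error relative to the observed value."""
    return abs(predicted - actual) / abs(actual)


def sliding_window_fitness(
    series: Sequence[float],
    window: int,
    predict_next: Callable[[Sequence[float]], float],
    theta: float = 0.05,
) -> Tuple[float, float, float]:
    """Return (correct rate, mean square error, mean relative error).

    A forecast counts as correct when its relative error is below theta,
    the acceptable relative error. The work set always holds `window`
    samples: each step drops the oldest sample and appends the next one.
    """
    n_tests = len(series) - window
    if n_tests <= 0:
        raise ValueError("series must be longer than the work set")

    hits, sq_err, rel_err = 0, 0.0, 0.0
    for start in range(n_tests):
        work_set = series[start:start + window]  # current work set
        actual = series[start + window]          # first sample after it
        predicted = predict_next(work_set)       # hypothetical model call
        err = relative_error(actual, predicted)
        hits += err < theta
        sq_err += (predicted - actual) ** 2
        rel_err += err
    return hits / n_tests, sq_err / n_tests, rel_err / n_tests


def window_mean(work_set: Sequence[float]) -> float:
    """Stand-in model (assumption): forecast the mean of the work set."""
    return sum(work_set) / len(work_set)


# Toy daily maximum temperatures, invented for illustration.
temps = [30.0, 31.0, 32.0, 33.0, 34.0, 35.0]
rate, mse, mre = sliding_window_fitness(temps, window=2, predict_next=window_mean)
print(rate, mse, mre)
```

In a GA-based search, the correct rate returned here would play the role of one individual's fitness, which is why it is available only after every test sample has been forecast.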
The experimental results of GA-OSVM on the sample set preprocessed by the improved method are as follows: with an acceptable relative error of less than 0.05, the forecasting correct rate reached 82%, the mean square error was 1.347451, and the mean relative error was 0.031575. These results indicate that the GA-OSVM algorithm captures the characteristics of the time series well and achieves better performance.

(4) Finally, the feasibility and superiority of the GA-OSVM algorithm are verified by comparison with a popular SVM implementation, SVMlight, and with PSO-based hyper-parameter selection for LS-SVM. The GA-OSVM algorithm provides a new approach to applying GA and SVM in the field of meteorological forecasting.
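The GA-based search over γ and σ outlined in contributions (2) and (3) can be sketched as follows. The abstract names only the ingredients (Psize, Gmax, Pc, Pm, and the selection, crossover, and mutation steps), so the real-valued encoding, tournament selection, arithmetic crossover, and uniform-reset mutation used here are assumptions, not the thesis's operators. The `fitness` callback is hypothetical: in the thesis it would be the forecasting correct rate of an SVM trained with the given (γ, σ), while the toy function below only exercises the loop.

```python
import random
from typing import Callable, Tuple

Individual = Tuple[float, float]  # (gamma, sigma)
Range = Tuple[float, float]       # (low, high)


def ga_search(fitness: Callable[[Individual], float],
              gamma_range: Range, sigma_range: Range,
              pop_size: int = 20, max_gen: int = 30,
              pc: float = 0.8, pm: float = 0.1,
              seed: int = 0) -> Tuple[Individual, float]:
    """Maximize `fitness` over (gamma, sigma) inside the given ranges."""
    rng = random.Random(seed)
    bounds = (gamma_range, sigma_range)

    # Generation 0: pop_size random individuals inside the search ranges.
    population = [(rng.uniform(*gamma_range), rng.uniform(*sigma_range))
                  for _ in range(pop_size)]
    best, best_score = population[0], float("-inf")

    for generation in range(max_gen + 1):
        scores = [fitness(ind) for ind in population]
        for ind, score in zip(population, scores):
            if score > best_score:
                best, best_score = ind, score
        if generation == max_gen:
            break  # stop once generation max_gen has been evaluated

        def select() -> Individual:
            # Binary tournament selection (assumed operator).
            a, b = rng.randrange(pop_size), rng.randrange(pop_size)
            return population[a] if scores[a] >= scores[b] else population[b]

        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            child = list(p1)
            if rng.random() < pc:
                # Arithmetic crossover (assumed), clamped to the ranges.
                w = rng.random()
                child = [min(max(w * x + (1 - w) * y, lo), hi)
                         for x, y, (lo, hi) in zip(p1, p2, bounds)]
            if rng.random() < pm:
                # Mutation (assumed): redraw one gene uniformly in its range.
                i = rng.randrange(2)
                child[i] = rng.uniform(*bounds[i])
            children.append((child[0], child[1]))
        population = children
    return best, best_score


def toy_fitness(ind: Individual) -> float:
    """Hypothetical stand-in for the forecasting correct rate."""
    gamma, sigma = ind
    return -((gamma - 10.0) ** 2 + (sigma - 2.0) ** 2)


best, score = ga_search(toy_fitness, (1.0, 100.0), (0.1, 10.0))
print(best, score)
```

Passing the fitness as a callback keeps the search loop independent of the model, so the same sketch could in principle wrap either a batch-trained or a sliding-window evaluation.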
Keywords/Search Tags: GA-Based-Parameter-Optimization