
Study On Least Square Support Vector Machine Algorithms And Their Applications

Posted on: 2009-03-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X C Guo
Full Text: PDF
GTID: 1118360272976433
Subject: Computer application technology

Abstract/Summary:
Support Vector Machine (SVM) is a powerful machine learning method built on the framework of Statistical Learning Theory (SLT) and designed specifically for learning from a limited number of samples. As a general learning machine, training an SVM is essentially equivalent to solving a convex quadratic programming problem. Because SVM is based on the structural risk minimization (SRM) principle, it controls the empirical risk and the complexity of the learning machine simultaneously, so it effectively avoids over-fitting and achieves better generalization than traditional methods based on empirical risk minimization (ERM). Least squares support vector machines (LS-SVMs), introduced by Suykens et al., are reformulations of the standard SVM that greatly simplify training by replacing the inequality constraints with equality constraints. This reduces the computational complexity, accelerates training, and thereby promotes SVMs for a wider range of applications. Statistical learning theory and support vector machines are now regarded as the best-founded learning framework for limited samples, attract increasing attention, and have become a new hot spot in machine learning and artificial intelligence research. This dissertation focuses on LS-SVM algorithms and their applications. The main contributions are as follows:

1. PSO-based hyper-parameter selection for LS-SVMs. SVM is a kernel-based learning algorithm, so the model selection problem cannot be bypassed in applications. The first issue is how to select the kernel function: in theory, any function that satisfies Mercer's condition can serve as a kernel, but the performance of the SVM differs from kernel to kernel.
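To make the equality-constrained LS-SVM formulation concrete, the following sketch trains an LS-SVM regressor by solving the resulting (n+1)x(n+1) linear system directly. This is a minimal illustration under assumptions (an RBF kernel, regression rather than classification), not the dissertation's implementation:

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian RBF kernel matrix: K[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Train an LS-SVM regressor: because the constraints are equalities, the
    KKT conditions reduce to one linear system
        [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    instead of a quadratic program."""
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]          # bias b, dual coefficients alpha

def lssvm_predict(Xnew, X, alpha, b, sigma=1.0):
    """f(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(Xnew, X, sigma) @ alpha + b
```

A single dense solve replaces the iterative QP of the standard SVM, which is the source of the speed-up mentioned above (at the cost of losing sparseness in alpha).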
Even after a kernel type is chosen, its parameters (such as the degree of a polynomial kernel or the width of a radial basis kernel) still need to be selected and optimized. The regularization parameter γ and the kernel parameters are usually called the hyper-parameters of an SVM, and they strongly affect its performance. Existing techniques for tuning them fall into two classes: analytical techniques and heuristic searches. Analytical techniques determine the hyper-parameters from the gradients of some generalized error measure. Such iterative gradient-based algorithms rely on smoothed approximations of that measure, so the search direction is not guaranteed to point toward an optimum of the true generalization performance, which is often discontinuous. Heuristic techniques instead use modern meta-heuristics such as genetic algorithms (GAs), simulated annealing, and other evolutionary strategies (EAs). Grid search is a conventional way to handle the discontinuity, but it requires an exhaustive and time-consuming sweep over the hyper-parameter space, the feasible interval and a suitable sampling step must be located first, and manual model selection becomes intractable with more than two hyper-parameters. Particle swarm optimization (PSO), developed by Eberhart and Kennedy in 1995, is a stochastic global optimization technique inspired by the social behavior of bird flocking. Like GAs and EAs, PSO is a population-based optimizer that searches for optima by updating generations; unlike them, it needs no evolutionary operators such as crossover and mutation.
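The population update just described can be sketched as a minimal global-best PSO with an inertia weight. In the dissertation the objective would be a cross-validated LS-SVM error over the hyper-parameters; here it is left as a generic function to be supplied by the caller:

```python
import numpy as np

rng = np.random.default_rng(0)

def pso_minimize(f, bounds, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5):
    """Global-best PSO: each particle is pulled toward its personal best and
    the swarm's best; no crossover or mutation operators are needed.
    bounds: array of shape (dim, 2) giving [low, high] per dimension."""
    lo, hi = bounds[:, 0], bounds[:, 1]
    x = rng.uniform(lo, hi, (n_particles, len(lo)))    # positions
    v = np.zeros_like(x)                               # velocities
    pbest, pbest_f = x.copy(), np.array([f(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()                 # global best
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)                     # keep inside the box
        fx = np.array([f(p) for p in x])
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        g = pbest[pbest_f.argmin()].copy()
    return g, pbest_f.min()
```

For hyper-parameter selection, `f` would evaluate, say, `(log gamma, log sigma)` by training an LS-SVM and returning the validation error; only the box bounds and swarm constants need adjusting.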
Compared with GAs and EAs, PSO can escape from local optima, is easy to implement, and has fewer parameters to adjust. It has been successfully applied to function optimization, artificial neural network training, fuzzy system control, and so on, and has proved robust and fast on non-linear, non-differentiable, and multi-modal problems. This dissertation presents a novel PSO-based hyper-parameter selection method for LS-SVM classifiers. The method does not depend on any analytic property of the generalization performance measure and can determine multiple hyper-parameters at the same time. Its feasibility is evaluated on benchmark data sets by optimizing the hyper-parameters of LS-SVMs with linear, polynomial, radial basis (RBF), and scaling radial basis (SRBF) kernels. To account for the differing importance of the input components in classification, the SRBF kernel assigns a separate scaling factor to each input component. Experimental results show that the SRBF kernel yields the best test performance, with the polynomial and RBF kernels close behind, and that the proposed PSO-based selection achieves higher accuracy than other methods on all data sets tested in this dissertation. The proposed method is therefore efficient.

2. Concentration analysis based on near infrared spectroscopy (NIRS) and LS-SVMs. Near infrared spectroscopy is a rapid, non-destructive analysis technique that needs no sample preparation, so it is widely used in many application fields. To exploit the spectra and select the optimal calibration wavelengths, a strong and stable regression method is required.
Once the regression model is constructed, the concentration of a material in a mixture can be predicted from its NIRS data. In this dissertation we analyze the NIR spectra of water-ethanol binary mixtures at different concentrations and find that the spectral distribution in the regions 1400~1570 nm and 1750~1850 nm is dispersed, which indicates that these regions contain most of the information about the alcohol concentration. Based on this analysis, a spectral energy index is defined to determine the wavelength region (wave band) used to build the regression model, and an algorithm for selecting the optimal wave band is proposed. Finally, an LS-SVM regression model built on the chosen optimal wave band determines the alcohol concentration in the mixture. Concentration-prediction experiments on water-ethanol mixtures show that the sample construction approach is effective and that the LS-SVM outperforms conventional artificial neural networks (ANN) and partial least squares regression (PLSR). The proposed method can also be used to predict concentrations in more complex mixtures.

3. Solving integral equations of the second kind with LS-SVMs. Integral equations are the mathematical formulations that describe the physical laws of the object under study, and the mathematical models of many scientific and engineering problems reduce to integral equations. Ordinary differential equations usually require additional initial or boundary conditions, whereas an integral equation already contains the initial-value or boundary information; in particular, numerical approaches to integral equations are often easier and more direct than those for differential equations. All solution methods for integral equations build on their basic theory. The SVM, grounded in statistical learning theory, is a powerful tool for function regression and pattern recognition.
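The wave-band selection of contribution 2 could, under assumptions, look like the following sketch. The abstract does not define the spectral energy index, so here it is assumed to be the summed across-sample variance of the absorbance within a sliding window, since the dissertation identifies informative regions by how dispersed the spectra are:

```python
import numpy as np

def select_wave_band(spectra, width=50):
    """Pick the contiguous window of `width` wavelength channels whose
    across-sample dispersion ('energy') is largest.
    spectra: (n_samples, n_wavelengths) absorbance matrix.
    Returns (start, stop) channel indices of the chosen band."""
    energy = spectra.var(axis=0)                 # dispersion at each wavelength
    # summed energy of every contiguous window of `width` channels
    window = np.convolve(energy, np.ones(width), mode="valid")
    start = int(window.argmax())
    return start, start + width
```

An LS-SVM regressor would then be fitted only on `spectra[:, start:stop]`, discarding the uninformative channels.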
Motivated by the powerful regression ability of SVMs, we propose a hybrid approach based on LS-SVMs and trapezoid quadrature for solving linear Volterra integral equations of the second kind. The unknown function f(x) is approximated with an LS-SVM, and this approximation is used step by step in the subsequent numerical solution. Comparison with analytic solutions shows that the maximum absolute errors stay below the order of 10^-6, i.e., the proposed algorithm reaches very high accuracy. It is also compared with the repeated modified trapezoid quadrature method, itself very accurate for linear integral equations, and slightly outperforms it. We therefore conclude that the proposed method is feasible for the numerical solution of linear Volterra integral equations.

4. Short-term electric power load forecasting based on LS-SVMs. Load forecasting is the basis of power-network planning decisions and a precondition of the electric power market. Forecasting precision directly determines whether customers receive safe, high-quality power and whether the power system runs economically, so it is important to the economic benefit of power utilities. Analysis of the load behavior shows that the daily meteorological factors (maximum, minimum, and average temperature, humidity, weather type, etc.) strongly influence the electric power load. Because days with similar weather conditions also tend to have similar load values, a weather-status similarity degree is defined, and a dynamic training set is composed of historical records whose weather conditions are similar to those of the forecast day, selected from the whole historical load data.
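For reference, the classical step-by-step trapezoid scheme that contribution 3 hybridizes with LS-SVM regression is sketched below for f(x) = g(x) + ∫_a^x k(x,t) f(t) dt. This is plain quadrature only; the dissertation's LS-SVM approximation of f is omitted:

```python
import numpy as np

def volterra2_trapezoid(g, k, a, b, n):
    """Solve the second-kind linear Volterra equation
        f(x) = g(x) + int_a^x k(x, t) f(t) dt
    on n+1 equispaced nodes: at each node the integral is replaced by the
    trapezoid rule over the already-computed values, and the implicit term
    k(x_i, x_i) f_i is moved to the left-hand side."""
    x = np.linspace(a, b, n + 1)
    h = (b - a) / n
    f = np.zeros(n + 1)
    f[0] = g(x[0])                           # the integral vanishes at x = a
    for i in range(1, n + 1):
        s = 0.5 * k(x[i], x[0]) * f[0] + sum(
            k(x[i], x[j]) * f[j] for j in range(1, i))
        f[i] = (g(x[i]) + h * s) / (1.0 - 0.5 * h * k(x[i], x[i]))
    return x, f
```

For example, f(x) = e^x solves f(x) = 1 + ∫_0^x f(t) dt, and the scheme recovers it to the trapezoid rule's O(h^2) accuracy; the dissertation's hybrid replaces the pointwise values with an LS-SVM approximation of f.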
This saves training time and avoids disturbance from irrelevant samples. Load forecasting is a multi-feature, large-scale problem, and day load curves under similar weather conditions are themselves similar, so principal component analysis (PCA) is applied to the sample set to reduce the input dimensionality. Finally, an LS-SVM model is built to forecast the load. Example predictions show that the forecasting precision of the LS-SVM model exceeds that of an ANN.

SVM has a solid theoretical foundation and good generalization ability, so it will be widely used. This dissertation studies improvements to SVM and its applications; this research work advances the theoretical study of the algorithm and expands its application in the field of pattern recognition.
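The similar-day selection and PCA reduction of contribution 4 can be sketched as follows. Euclidean distance over the weather feature vectors is an assumed stand-in for the dissertation's weather-status similarity degree, and the PCA here is a plain SVD projection:

```python
import numpy as np

def similar_days(weather_hist, weather_target, n_days=30):
    """Pick the historical days whose weather vectors (max/min/avg temperature,
    humidity, ...) are closest to the forecast day's, forming the dynamic
    training set (Euclidean distance is an assumed similarity measure)."""
    d = np.linalg.norm(weather_hist - weather_target, axis=1)
    return np.argsort(d)[:n_days]

def pca_reduce(X, n_components):
    """Project the (centered) samples onto the top principal components,
    reducing the input features fed to the LS-SVM forecaster."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T
```

The reduced features of the selected days would then train the LS-SVM load model, keeping the training set small and relevant to the forecast day.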
Keywords/Search Tags: Machine Learning, Statistical Learning Theory, Kernel Function, Least Squares Support Vector Machine, Particle Swarm Optimization, Hyper-parameter Selection, Near Infrared Spectroscopy, Integral Equations, Electric Power Load Forecasting