
Study On The Incremental Learning Algorithms For Support Vector Machines

Posted on: 2009-12-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Duan
Full Text: PDF
GTID: 1118360305456864
Subject: Applied Mathematics

Abstract/Summary:
The support vector machine (SVM) is a method for pattern recognition based on statistical learning theory; it implements the structural risk minimization principle of that theory. Input data are mapped into a high-dimensional feature space in which an optimal separating hyperplane is built. Finding the optimal separating hyperplane is in fact a convex quadratic programming problem with linear constraints. Although there are many effective approaches to solving convex quadratic programming problems, large-scale problems demand large storage space and heavy computation. The Lagrangian Support Vector Machine (LSVM) proposed by Mangasarian et al. improves on the classification problem of the standard linear support vector machine: it leads to the minimization of a differentiable convex function under nonnegativity constraints, and its solution is obtained by a simple, linearly convergent iteration scheme. By means of the Sherman-Morrison-Woodbury identity, the matrix inversion in the algorithm is reduced, at the start of the algorithm, to one whose order is the dimensionality of the original input space plus one, which cuts the computation time.

The main contributions of this thesis comprise five parts:

(1) Online and batch incremental learning algorithms are proposed for Lagrangian Support Vector Machines. As samples are added, the classifier should be updated accordingly. In the incremental learning algorithms proposed for LSVM, the matrix inverse after an increment is computed from the previous information, so the computation need not be repeated from scratch. Nine datasets from a standard pattern classification library are selected to verify the performance of the proposed algorithms. Experimental results show that the proposed algorithms save considerable CPU running time while preserving training accuracy.

(2) As the training set grows, it requires ever larger storage space, and training slows down. To overcome this difficulty, online and batch decremental learning algorithms for LSVM are presented. Using block computation of a matrix and the properties of a symmetric positive-definite matrix, an approach to computing the matrix inverse is derived and applied in the decremental learning algorithms for LSVM. In the presented algorithms, relearning is unnecessary, since the matrix inverse after a decrement is computed from the previous information. Experiments show that the computation time of the algorithms is reduced.

(3) Without an effective strategy for deciding which samples to discard or retain during incremental and decremental learning, the training set grows ever larger, affecting both computing time and accuracy. How to select samples to add or discard is therefore important. We present a new online incremental and decremental LSVM algorithm with a sample selection strategy. In our algorithm, the sample stamp is used to determine which samples should be discarded, and the matrix inverse after discarding is computed from the previous information instead of relearning the new training set. The experimental results show better performance in comparison with existing algorithms.
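Contributions (1)-(3) share a single computational idea: when samples enter or leave the training set, the needed matrix inverse is updated from the previously computed inverse rather than recomputed. The abstract does not reproduce the thesis's exact update formulas, so the numpy sketch below is only one plausible realization under stated assumptions: with the Sherman-Morrison-Woodbury identity, linear LSVM only needs the inverse of the small matrix S = I/nu + H^T H of order d+1 (a row of H playing the role of y_i * [x_i, -1]), and adding or removing one sample is a rank-one change to S, handled by the Sherman-Morrison formula in O((d+1)^2) time. All function and variable names here are illustrative.

```python
import numpy as np

def smw_inverse(H, nu):
    # Q = I/nu + H @ H.T is n x n, but the Sherman-Morrison-Woodbury
    # identity reduces the work to inverting S = I/nu + H.T @ H,
    # whose order is the input dimension plus one.
    d1 = H.shape[1]
    return np.linalg.inv(np.eye(d1) / nu + H.T @ H)

def add_sample(S_inv, h):
    # Increment: S_new = S + h h^T.  Sherman-Morrison gives
    # S_new^{-1} = S^{-1} - (S^{-1} h)(S^{-1} h)^T / (1 + h^T S^{-1} h).
    v = S_inv @ h
    return S_inv - np.outer(v, v) / (1.0 + h @ v)

def remove_sample(S_inv, h):
    # Decrement: S_new = S - h h^T (valid while S_new stays positive
    # definite); the same formula applies with the sign flipped.
    v = S_inv @ h
    return S_inv + np.outer(v, v) / (1.0 - h @ v)

# Usage sketch: maintain S_inv as rows of H arrive and depart.
rng = np.random.default_rng(0)
H = rng.standard_normal((100, 11))   # rows stand in for y_i * [x_i, -1]
S_inv = smw_inverse(H, nu=1.0)
h_new = rng.standard_normal(11)
S_inv = add_sample(S_inv, h_new)     # O(d^2): no fresh inversion
S_inv = remove_sample(S_inv, h_new)  # returns to the previous inverse
```

Adding and then removing the same row reproduces the original inverse up to floating-point error, which is the property the incremental and decremental algorithms exploit.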
(4) Online and batch incremental regression learning algorithms are proposed for Lagrangian Support Vector Regression (LSVR). When new samples are added, the matrix inverse after the increment is computed from the previous results, so the whole training set need not be relearned, which shortens the computation. Experimental results on three real data sets and two generated data sets show the effectiveness of these two algorithms. The online incremental regression learning algorithm for LSVR is then implemented and tested on the Mackey-Glass time series, and its performance is compared with the SVMlight and SOR-SVR algorithms. According to the experimental results, high-quality time series prediction is achieved.

(5) Based on the two-class classification problem of LSVM, One-vs-One and One-vs-All multi-class classification algorithms are proposed for LSVM. The One-vs-One multi-class algorithm requires solving κ(κ-1)/2 two-class LSVM problems, and the One-vs-All algorithm requires solving κ of them, where κ is the number of classes; the corresponding One-vs-One and One-vs-All algorithms for the standard SVM require solving quadratic programming problems at greater cost. Experimental results in the linear and nonlinear cases indicate that the CPU running time of these two LSVM algorithms is shorter than that of the standard SVM, while their training and testing accuracies are almost identical. Finally, online incremental learning algorithms for One-vs-One and One-vs-All multi-class classification with LSVM are presented, built on the online incremental learning algorithms for the two-class LSVM problem.
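To make the classifier counts in contribution (5) concrete, the sketch below reduces a κ-class problem to binary problems in both ways. Here `train_binary` stands in for any two-class LSVM trainer; the abstract specifies no interface, so this signature is an assumption.

```python
import numpy as np
from itertools import combinations

def one_vs_all(X, labels, classes, train_binary):
    # kappa binary problems: each class against all the others.
    return {c: train_binary(X, np.where(labels == c, 1.0, -1.0))
            for c in classes}

def one_vs_one(X, labels, classes, train_binary):
    # kappa*(kappa - 1)/2 binary problems: one per unordered class pair.
    models = {}
    for a, b in combinations(classes, 2):
        mask = (labels == a) | (labels == b)
        y = np.where(labels[mask] == a, 1.0, -1.0)
        models[(a, b)] = train_binary(X[mask], y)
    return models
```

Prediction would then take the maximum decision value over the κ One-vs-All models, or a majority vote over the κ(κ-1)/2 pairwise models.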
Finally, the future research work related to this thesis is presented.

Keywords/Search Tags: Machine Learning, Statistical Learning Theory, Support Vector Machine, Classification Problem, Regression Problem, Incremental Learning, Time Series, Multi-classification