
Support Vector Machine Learning Under Noisy And Overlapping Data

Posted on: 2003-07-04    Degree: Master    Type: Thesis
Country: China    Candidate: S W Ji    Full Text: PDF
GTID: 2168360062996407    Subject: Computer application technology
Abstract/Summary:
Any process that can improve the performance of a system can be called learning. It is generally believed that learning lies at the very core of both biological and artificial intelligence. With the recent development of the Internet and database technology, vast amounts of complex, high-dimensional data are emerging, which creates an urgent need for efficient nonlinear adaptive information-processing algorithms for complex data analysis.

Statistical learning theory derives necessary and sufficient conditions for consistency and fast convergence of the Empirical Risk Minimization principle, which underlies most traditional learning algorithms, and it provides the theoretical foundation for support vector algorithms. The support vector learning algorithm is based on the Structural Risk Minimization principle. It combines two remarkable ideas: maximum-margin classifiers with low capacity and therefore good generalization, and implicit feature spaces defined by kernel functions. Built on these ideas, the support vector machine (SVM) elegantly overcomes the curse of dimensionality and the over-fitting problem. Owing to its remarkable performance in many applications, the SVM is receiving increasing attention.

Training an SVM can be formulated as a quadratic programming problem. For large learning tasks with many training examples, off-the-shelf optimization techniques quickly become intractable in their memory and time requirements, so many efficient techniques have been developed. These techniques divide the original problem into several smaller sub-problems; by solving these sub-problems iteratively, the original, larger problem is solved. All of the proposed methods suffer from the bottleneck of long training time, especially when the data are noisy and overlapping.

It is observed, both experimentally and from the structure of the algorithm, that when training data are noisy and overlapping, many support vectors have Lagrange multipliers at the upper bound. If it were known beforehand which examples are bound support vectors, these examples could be removed from the training set with their multipliers fixed at the upper bound. Because this reduces the number of free variables, the approach is promising for improving training time. According to the theoretical derivation of the SVM training process, the training result is not compromised as long as all training examples satisfy the optimality conditions before the algorithm terminates. This thesis proposes a heuristic rule that decides which examples are bound support vectors based on their past behavior during training: if an example's multiplier stays at the bound for a given number of consecutive iterations, it is regarded as a bound support vector and removed from the training set; for the remainder of training its multiplier is fixed at the upper bound and it is not included in the working set. To ensure that the final solution is globally optimal, the modified training algorithm checks every bound support vector against the optimality conditions before terminating; if any bound support vector still violates them, training is restarted until all training examples satisfy the optimality conditions.

The original and modified algorithms are compared on many standard benchmark datasets, including face detection and the USPS and MNIST handwritten digit recognition problems. Numerical results show that the modified algorithm outperforms the original algorithm in training time in most cases, especially when the training data are noisy and overlapping.
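For context, the quadratic program referred to above is the standard soft-margin SVM dual; it is stated here for reference and is not quoted from the thesis itself:

\[
\max_{\alpha}\;\; \sum_{i=1}^{n}\alpha_i \;-\; \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j\, y_i y_j\, K(x_i,x_j)
\qquad \text{s.t.}\quad 0 \le \alpha_i \le C,\;\; \sum_{i=1}^{n}\alpha_i y_i = 0 .
\]

Examples whose multipliers satisfy \(\alpha_i = C\) at the solution are the "bound support vectors" discussed in the abstract.

The following Python sketch illustrates the bookkeeping such a heuristic implies: counting how many consecutive iterations each multiplier sits at the bound, freezing it at C after a patience threshold, and checking the KKT condition for bound examples before terminating. The class name, the `patience` threshold, and the tolerances are illustrative assumptions; the abstract does not specify the thesis's actual rule or implementation.

```python
import numpy as np

class BoundSVTracker:
    """Minimal sketch: track how many successive decomposition iterations each
    alpha_i stays at the upper bound C; after `patience` consecutive hits,
    freeze it there and drop it from future working sets."""

    def __init__(self, n_samples, C, patience=3, tol=1e-8):
        self.C = C
        self.patience = patience
        self.tol = tol
        self.hits = np.zeros(n_samples, dtype=int)     # consecutive at-bound counts
        self.frozen = np.zeros(n_samples, dtype=bool)  # multipliers fixed at C

    def update(self, alpha):
        """Call once per decomposition iteration with the current multipliers.
        Returns a mask of examples still eligible for the working set."""
        at_bound = np.abs(alpha - self.C) < self.tol
        self.hits = np.where(at_bound, self.hits + 1, 0)
        # Freeze examples that have sat on the bound `patience` times in a row.
        self.frozen |= self.hits >= self.patience
        return ~self.frozen


def bound_kkt_violators(alpha, y, decision, C, eps=1e-3):
    """Final optimality check for bound support vectors (alpha_i = C): the KKT
    conditions require y_i * f(x_i) <= 1; violators force a restart."""
    at_bound = np.isclose(alpha, C)
    return np.where(at_bound & (y * decision > 1 + eps))[0]
```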
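A toy invocation of the sketch (solver details omitted; the alpha values are made up for illustration):

```python
tracker = BoundSVTracker(n_samples=5, C=1.0, patience=2)
alpha = np.array([1.0, 0.2, 1.0, 0.0, 1.0])
free_mask = tracker.update(alpha)   # first pass: nothing frozen yet
free_mask = tracker.update(alpha)   # second pass: indices 0, 2, 4 frozen at C
print(free_mask)                    # [False  True False  True False]
```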
Keywords/Search Tags:Statistical Learning Theory, Support Vector Machines, Complexity Regularization, Machine Learning, Pattern Recognition