Study Of Support Vector Machine Algorithms On Unbalanced Dataset

Posted on: 2011-11-13
Degree: Master
Type: Thesis
Country: China
Candidate: W H Liu
Full Text: PDF
GTID: 2178330305960530
Subject: Basic mathematics
Abstract/Summary:
The Support Vector Machine (SVM) was proposed by Vapnik et al. It is a powerful learning machine and an efficient machine-learning tool for small-sample problems. SVM has been widely applied in many areas, such as automatic text categorization, signal processing, handwritten digit recognition, and communications. It addresses the problems of overfitting, the curse of dimensionality, and local minima. In an unbalanced classification dataset, the difference in sample counts between classes degrades the performance of many classifiers, which poses a great challenge to traditional classification methods. Unbalanced dataset classification has therefore become an active topic in machine learning. In practical applications, the minority class often carries the more important information, so effectively improving a classifier's performance on the minority class is a difficult problem.

In this thesis, we first review the basic theory of SVM and the state of research on unbalanced dataset classification. In a binary classification problem, if the numbers of positive and negative samples differ greatly, prediction performance suffers. Veropoulos et al. improved the traditional SVM by choosing different penalty factors for the different classes. We propose a method for adjusting the separating hyperplane based on the numbers of positive and negative samples. This method effectively improves the prediction accuracy on the positive class.

Selecting optimal parameters is an important branch of SVM research. In this thesis, we discuss a model for unbalanced classification datasets that has two penalty parameters. In its dual problem, the two penalties are treated as parameters of the kernel function. Combining this with optimization methods, we propose a new method of parameter selection for unbalanced dataset classification, for L1-SVM and L2-SVM separately.
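
For orientation, a two-penalty soft-margin model of the kind described above is commonly written as follows. This is the standard textbook formulation, assumed here for illustration and not taken from the thesis itself; the symbols C_+, C_-, \xi_i, \phi, K, and \alpha_i are notation introduced for this sketch.

```latex
% L1-SVM with class-dependent penalties (assumed textbook form):
\min_{w,\,b,\,\xi}\quad
  \tfrac{1}{2}\lVert w\rVert^{2}
  + C_{+}\sum_{i:\,y_i=+1}\xi_i
  + C_{-}\sum_{i:\,y_i=-1}\xi_i
\quad\text{s.t.}\quad
  y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ \ge\ 1-\xi_i,\qquad \xi_i\ \ge\ 0 .

% L2-SVM with class-dependent penalties (squared slack, assumed textbook form):
\min_{w,\,b,\,\xi}\quad
  \tfrac{1}{2}\lVert w\rVert^{2}
  + \tfrac{C_{+}}{2}\sum_{i:\,y_i=+1}\xi_i^{2}
  + \tfrac{C_{-}}{2}\sum_{i:\,y_i=-1}\xi_i^{2}
\quad\text{s.t.}\quad
  y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ \ge\ 1-\xi_i .

% Dual of the L2-SVM: the penalties enter only through a modified kernel.
\max_{\alpha}\quad
  \sum_{i}\alpha_i
  - \tfrac{1}{2}\sum_{i,j}\alpha_i\alpha_j\,y_i y_j\,\tilde{K}(x_i,x_j)
\quad\text{s.t.}\quad
  \sum_{i}\alpha_i y_i = 0,\qquad \alpha_i\ \ge\ 0,

\tilde{K}(x_i,x_j) \;=\; K(x_i,x_j) + \frac{\delta_{ij}}{C_{y_i}},
\qquad
C_{y_i} =
\begin{cases}
  C_{+}, & y_i=+1,\\
  C_{-}, & y_i=-1 .
\end{cases}
```

In the L2-SVM dual, C_+ and C_- appear only on the diagonal of the modified kernel matrix, which is one way penalties can be handled alongside kernel parameters. In the standard L1-SVM dual they instead appear as box constraints 0 \le \alpha_i \le C_{y_i}; how the thesis treats that case is not stated in the abstract.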
Keywords/Search Tags:Statistical Learning Theory, Support Vector Machine, Unbalanced Dataset, Parameters Selection, Gradient Descent