
Research On Support Vector Machine Models And Algorithms For Imbalanced Data

Posted on: 2016-11-23
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J J Zhang
Full Text: PDF
GTID: 1228330467492189
Subject: Strategy and management
Abstract/Summary:
Support Vector Machine (SVM) is an important machine learning method based on statistical learning theory. It follows the structural risk minimization principle and uses kernel techniques to handle nonlinear problems: the kernel turns a nonlinear problem in the input space into a linear problem in the feature space. Kernel techniques also mitigate overfitting and the curse of dimensionality to a certain extent. SVM and its modified models have achieved some success on the class imbalance problem.

Learning from imbalanced datasets is currently an important research subject in machine learning. In the class imbalance problem, one class is represented by a large number of samples while the other is represented by only a few. In traditional algorithms the separating hyperplane is skewed toward the minority class, so minority-class samples are misclassified more easily than majority-class samples. In practical applications, however, the minority class is usually the more important one. Improving the accuracy on the minority class is therefore an important and meaningful research goal.

This dissertation studies the class imbalance problem and proposes three improved variants:

1. WWCS-BSVM: In the class imbalance problem, the weighted within-class scatter affects the classifier. We assign different weights to the two within-class scatters: a smaller weight for the minority class and a larger weight for the majority class. The aim is to make the minority class more compact. WWCS-BSVM improves the accuracy on the minority class, and the experimental results verify its effectiveness.

2. LSFOCSVM: The one-class SVM (OCSVM) performs well on the class imbalance problem. However, it applies the same penalty parameter to every sample during training, although the training samples contribute differently to constructing the classification hyperplane. We introduce fuzzy membership into the LSFOCSVM, and its solution is obtained by solving a system of linear equations.

3. LSOCSVFM: Input sample information is often uncertain. We study this case using fuzzy sets and propose a least squares version of the OCSVM on fuzzy sets to solve the class imbalance problem. In the LSOCSVFM, parameters such as the weight vector and the bias are fuzzy numbers. The final solution of the proposed model is obtained by solving a system of linear equations, and the experimental results verify the effectiveness of the LSOCSVFM.
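The abstract states that a least squares one-class SVM with fuzzy membership can be solved through a system of linear equations, but it does not give the formulation. The sketch below is therefore only an illustration under stated assumptions, not the dissertation's LSFOCSVM: it assumes a generic least squares one-class SVM primal in which each squared slack is scaled by a per-sample weight, and it uses a hypothetical distance-to-mean heuristic for the membership values. The function names, the RBF kernel, and the values of `C` and `gamma` are all illustrative choices.

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq_dists = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def fit_weighted_ls_ocsvm(X, membership, C=10.0, gamma=0.5):
    """Solve the KKT linear system of a membership-weighted LS one-class SVM.

    Assumed primal (an illustration, not taken from the dissertation):
        min  0.5 * ||w||^2 - rho + (C / 2) * sum_i s_i * xi_i^2
        s.t. w . phi(x_i) = rho - xi_i
    Eliminating w and xi leaves the linear system
        (K + diag(1 / (C * s))) alpha - rho * 1 = 0,   sum(alpha) = 1.
    """
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    system = np.zeros((n + 1, n + 1))
    system[:n, :n] = K + np.diag(1.0 / (C * membership))
    system[:n, n] = -1.0  # coefficient of rho in each equality constraint
    system[n, :n] = 1.0   # the multipliers alpha must sum to one
    rhs = np.zeros(n + 1)
    rhs[n] = 1.0
    solution = np.linalg.solve(system, rhs)
    return solution[:n], solution[n]


def decision_values(X_train, alpha, rho, X_new, gamma=0.5):
    """f(x) = sum_j alpha_j * k(x_j, x) - rho; a larger |f(x)| means less typical."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha - rho


# Toy data standing in for a single (target) class.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))

# Hypothetical membership heuristic: samples far from the class mean get a
# smaller weight, so their slack is penalised less.
dist = np.linalg.norm(X - X.mean(axis=0), axis=1)
membership = np.clip(1.0 - dist / (dist.max() + 1e-9), 0.1, 1.0)

alpha, rho = fit_weighted_ls_ocsvm(X, membership)

far_point = np.array([[100.0, 100.0]])
print("mean |f| on training samples:", np.abs(decision_values(X, alpha, rho, X)).mean())
print("|f| on a distant point:", np.abs(decision_values(X, alpha, rho, far_point))[0])
```

Because the inequality constraints of the standard OCSVM are replaced by equalities with squared slacks, training reduces to a single call to `np.linalg.solve` instead of a quadratic program. The membership weights only change the diagonal term added to the kernel matrix.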
Keywords/Search Tags: Support vector machine, Imbalanced data, Within-class scatter, One-class classification, Least squares