Support Vector Machine Based On Boundary Sample Selection

Posted on: 2015-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: C Li
Full Text: PDF
GTID: 2268330422469474
Subject: Computer application technology
Abstract/Summary:
Support vector machine (SVM) is a data mining method based on statistical learning theory and the structural risk minimization principle. SVM generalizes well and is widely used for classification and regression problems. However, solving the SVM quadratic programming problem has time complexity O(n^3) and space complexity O(n^2), so training time is long and memory demand is high on large-scale datasets.

As is well known, the classification hyperplane is determined only by the support vectors (SVs), and most SVs lie near the classification boundary. Based on this property, and to address the problem above, this paper presents two SVM training methods based on boundary sample selection. The proposed methods first employ a probabilistic neural network (PNN) and an extreme learning machine (ELM) to select boundary samples as candidate support vectors (CSVs), and then train an SVM on the CSVs. We use K-L divergence and entropy as heuristics to select the CSVs. A larger K-L divergence means a larger difference between the desired distribution and the actual distribution; in other words, the sample is hard to classify, and such samples usually lie near the classification boundary. Likewise, a larger entropy means that the sample carries more uncertain information and has more influence on the location of the classification hyperplane, so these samples also lie near the classification boundary.

We conducted experiments on two artificial datasets and thirteen UCI datasets. The experimental results verify that the proposed methods achieve promising test accuracy and training time on larger datasets. From these results, we summarize a basic framework for training SVMs based on boundary sample selection.
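The selection step described in the abstract can be illustrated with a small sketch. This is not the thesis implementation: it assumes a Parzen-window (PNN-style) estimate with a Gaussian kernel and a hypothetical width `sigma`, takes the "desired distribution" to be the one-hot label vector (so the K-L divergence reduces to the negative log of the probability assigned to the true class), and keeps a fixed, hypothetical `fraction` of the highest-scoring samples as CSVs. All function names are illustrative. The final step, training an SVM on the selected samples, is left to any SVM library and is not shown.

```python
import math

def pnn_posteriors(x, samples, labels, classes, sigma=0.5):
    # Parzen-window estimate per class: average Gaussian kernel between x
    # and the training samples of that class (sigma is an assumed width).
    densities = []
    for c in classes:
        members = [s for s, y in zip(samples, labels) if y == c]
        kernel_sum = sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, s)) / (2 * sigma ** 2))
            for s in members
        )
        # A small constant keeps the normalisation safe if kernels underflow.
        densities.append(kernel_sum / len(members) + 1e-12)
    total = sum(densities)
    return [d / total for d in densities]

def entropy(p):
    # Shannon entropy of a discrete distribution (natural logarithm).
    return -sum(q * math.log(q) for q in p if q > 0)

def kl_to_label(p, true_index, eps=1e-12):
    # K-L divergence from a one-hot target to p; this equals -log p[true_index].
    # Treating the desired distribution as one-hot is an assumption of this sketch.
    return -math.log(max(p[true_index], eps))

def select_boundary_samples(samples, labels, fraction=0.25,
                            heuristic="entropy", sigma=0.5):
    # Score every training sample, then keep the top fraction as candidate
    # support vectors. Returns indices into `samples`.
    classes = sorted(set(labels))
    scores = []
    for x, y in zip(samples, labels):
        p = pnn_posteriors(x, samples, labels, classes, sigma)
        if heuristic == "entropy":
            scores.append(entropy(p))
        else:
            scores.append(kl_to_label(p, classes.index(y)))
    k = max(1, round(fraction * len(samples)))
    ranked = sorted(range(len(samples)), key=scores.__getitem__, reverse=True)
    return ranked[:k]

# Toy 1-D data: class 0 on the left, class 1 on the right, boundary near 1.0.
samples = [(0.0,), (0.2,), (0.4,), (0.9,), (1.1,), (1.6,), (1.8,), (2.0,)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
csv_by_entropy = select_boundary_samples(samples, labels, heuristic="entropy")
csv_by_kl = select_boundary_samples(samples, labels, heuristic="kl")
```

On this toy data both heuristics pick the two samples closest to the class boundary. A fixed fraction is only one possible selection rule; a score threshold would fit the same structure.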
Keywords/Search Tags: Support Vector Machine, Sample Selection, K-L Divergence, Probabilistic Neural Network, Extreme Learning Machine