
Study on Several Issues of v-Support Vector Classification

Posted on: 2017-04-27    Degree: Master    Type: Thesis
Country: China    Candidate: Z Zhu    Full Text: PDF
GTID: 2308330485470830    Subject: Operational Research and Cybernetics
Abstract/Summary:
Support vector classification (SVC) is a class of machine learning algorithms proposed by Vapnik et al. in 1995. It is based on statistical learning theory and has a simple mathematical form and standard, efficient training methods. Unlike traditional machine learning algorithms built on the empirical risk minimization principle, such as neural networks and decision trees, the support vector classifier is based on the structural risk minimization principle, minimizing the sum of the empirical risk and a confidence bound. For this reason, support vector classification attracted wide attention and developed rapidly in the 1990s. With its solid theoretical foundation, strong generalization ability, and good performance on practical problems involving small samples and nonlinearity, it has become one of the mainstream machine learning algorithms. SVC is now widely used in pattern recognition, bioinformatics, text and handwriting recognition, and other fields.

The parameter C in C-SVC ranges over (0, +∞) and has no quantitative meaning, so its value is often difficult to choose in practice. Schölkopf et al. therefore proposed v-SVC in 2000, replacing C with a parameter v whose value does have a definite meaning. The main work of this thesis is as follows:

(1) The relationship between the parameter v in v-SVC and the total number of samples is explored. Using the Markov inequality and related results from probability theory, it is proved that v asymptotically approximates the ratio of the number of support vectors to the total number of samples (the underlying property is stated below).

(2) The mapping between the parameter v in v-SVC and the parameter C in C-SVC is studied. The breast-cancer-wisconsin (diagnostic), iris, and letter-recognition datasets are downloaded from the UCI repository, and C-support vector classification is implemented in RStudio. The experimental results show that when a non-positive-definite kernel function is used on the training set, the mapping between v and C is not always non-increasing (a minimal sketch of the comparison appears below).

(3) How to obtain an optimal classifier by selecting the optimal parameter v using precision and recall is studied. Five standard datasets are downloaded from the UCI repository: balance-scale, breast-cancer-wisconsin (original), iris, letter-recognition, and waveform. v-SVC is implemented in RStudio; each dataset is split into 2/3 for training and 1/3 for testing, and precision and recall are computed for different values of v (see the evaluation sketch below). The experimental results show that, within the interval of optimal parameters, v-SVC maintains high precision and recall on both the training and test sets.
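For reference, item (1) builds on the standard property of v established by Schölkopf et al. (2000); the statement below is that known result (with \ell denoting the total number of training samples), not text taken from the thesis itself:

\[
  \frac{\#\{\text{margin errors}\}}{\ell} \;\le\; \nu \;\le\; \frac{\#\{\text{support vectors}\}}{\ell},
\]

and, under mild conditions on the data distribution, both fractions converge to \nu as \ell \to \infty, which is the asymptotic behaviour the thesis investigates via the Markov inequality.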
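A minimal R sketch of the comparison in item (2), training a C-SVC and a v-SVC on the iris dataset. The thesis only states that RStudio was used; the e1071 package, the radial kernel, and the parameter values here are assumptions for illustration, not the thesis code:

# Sketch only: compare the C and v parametrizations of SVC on iris.
library(e1071)

data(iris)
set.seed(1)

# C-SVC: the penalty parameter C ("cost" in e1071) ranges over (0, +Inf)
c_model  <- svm(Species ~ ., data = iris,
                type = "C-classification", kernel = "radial", cost = 1)

# v-SVC: the parameter nu lies in (0, 1] and directly bounds the fractions
# of margin errors and of support vectors
nu_model <- svm(Species ~ ., data = iris,
                type = "nu-classification", kernel = "radial", nu = 0.1)

# number of support vectors selected under each parametrization
c(C_SVC = nrow(c_model$SV), nu_SVC = nrow(nu_model$SV))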
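A minimal R sketch of the evaluation in item (3), again assuming the e1071 package and the iris dataset: a 2/3 training / 1/3 test split, a confusion matrix on the test set, and macro-averaged precision, recall, and F-value for several values of v:

# Sketch only: precision/recall/F-value of v-SVC over a grid of v values.
library(e1071)

data(iris)
set.seed(2)
train_idx <- sample(nrow(iris), size = round(2 / 3 * nrow(iris)))
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

prf <- function(cm) {
  # cm: confusion matrix with rows = predicted class, columns = true class;
  # macro-average precision and recall, then take their harmonic mean (F-value)
  precision <- diag(cm) / rowSums(cm)
  recall    <- diag(cm) / colSums(cm)
  p <- mean(precision, na.rm = TRUE)
  r <- mean(recall, na.rm = TRUE)
  c(precision = p, recall = r, F = 2 * p * r / (p + r))
}

for (nu in c(0.05, 0.1, 0.2, 0.4)) {
  model <- svm(Species ~ ., data = train,
               type = "nu-classification", kernel = "radial", nu = nu)
  cm <- table(predicted = predict(model, test), true = test$Species)
  cat("nu =", nu, ":", round(prf(cm), 3), "\n")
}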
Keywords/Search Tags: v-SVC, Kernel Function, Confusion Matrix, Precision, Recall, F-value