Research On Fast Algorithms For Cost Sensitive Support Vector Machine

Posted on:2017-01-04

Degree:Master

Type:Thesis

Country:China

Candidate:X Quan

Full Text:PDF

GTID:2308330485998924

Subject:Computer Science and Technology

Abstract/Summary:

PDF Full Text Request

Support vector machine is a kind of classification algorithm, which is proposed by Vapnik et al. It has been widely used in machine learning and data mining field because of the excellent generalization performance. It assumes that misclassifying samples belong to different classes would lead to the same cost in conventional classification algorithms. However, misclassifying samples belong to different classes would lead to different cost in many real applications, such as disease diagnosis, credit card fraud detection and others. To handle this kind of application, which is called cost sensitive problem, researchers proposed many cost sensitive algorithms. Cost sensitive support vector machine among those cost sensitive algorithms has excellent performance and widely adaptability. In this paper, we focus on the cost sensitive support vector machine. And the innovation research results are as follows.(1) For cost sensitive problem, we design a series of contrast experiments to compare multiple cost sensitive algorithms. In the experiments, we use fourteen datasets including ten cost sensitive datasets and four imbalanced datasets. To compare these algorithms, we evaluate these algorithms with four performance measure including total cost, AUC, F1-measure and G-mean, which is widely used to evaluate cost sensitive algorithms. Through the contrast experiments, we find out that cost sensitive support vector machine have better performance than other algorithms and can adapt to multiple datasets from different fields.(2) We proposed a batch fast algorithm for cost sensitive support vector machine. Similar to cost insensitive support vector machine, cost sensitive support vector machine try to solve a quadratic programming problem. Thus it can be solved with SMO algorithm. We first give the derivation process of SMO algorithm for cost sensitive support vector machine and analyze the time complexity. Then we propose an algorithm framework which can speed up the SMO algorithm using stochastic gradient descent method. After that, we design a contrast experiment to verify the effectiveness of the algorithm framework proposed. And with the analysis of several key indicators of algorithm, we can confirm the analysis of time complexity above.(3) In order to handle the online learning application, we propose multiple incremental fast algorithm for cost sensitive support vector machine. Batch algorithms cannot handle online learning problem well because they should re-train all the samples existed when some samples are added into the training dataset. However, incremental algorithms can absorb added samples and update existed model without re-train all the samples from scratch. We first derivate the multiple incremental cost sensitive support vector machine. Then we illustrate the effectiveness and efficiency of the algorithm through the experimental study. In addition, we also analyze the underlying reasons of the efficiency with further experiments.

Keywords/Search Tags:

Support Vector Machine, Cost Sensitive Learning, SMO, Stochastic Gradient Descent, Online Learning

PDF Full Text Request

Related items

1	Dynamic Regret Of Online Gradient Descent:Analyses And Applications
2	A Research Of Stochastic Gradient Descent Algorithm
3	Imbalanced Stochastic Gradient Descent Online Algorithm For SVM
4	Research On Cost-Sensitive Machine Learning Based On Dynamic Cost
5	Research On Improving The Convergence Performance Of Stochastic Gradient Descent In Distributed Machine Learning
6	The Cost-sensitive Support Vector Machine Supervised Learning
7	Research On Optimization Theory And Algorithms For Online Learning
8	A Research And Application On Stochastic Gradient Descent Algorithm In Distributed Cluster
9	Research On Privacy Preserving Methods In Stochastic Gradient Descent
10	Research On Support Vector Machine Based On Improved Loss Function