Research On Large-scale Support Vector Machine Training

Posted on:2011-09-07

Degree:Master

Type:Thesis

Country:China

Candidate:S D Liu

Full Text:PDF

GTID:2178330338976298

Subject:Computer software and theory

Abstract/Summary:

PDF Full Text Request

For high-dimensional data, there is usually no obvious improvement on the predict performance of decision function by using kernel methods to map input samples to higher-dimensional space. While linear SVM(Support Vector Machine) can provide prominent predict performance for such kind of data. Each component of the weight vector can be computed from the lagrange multiplier and training sample directly when using linear kernel. Direct Representation of weight decreases the computational cost of dot product between weight vector and training sample when there are lots of support vectors. Based on minimum enclosing ball, this paper solves a variant of classical SVM problem. The proposed method stores weights vector w and the common factor of every component, and only needs to update a small part of the weight vector. The time complexity isΟ( md), where d is the average number of non-zero attributes and m is the size of core set. Experiments show that on some datasets the proposed method is faster than state-of-the-art linear SVM solver ,such as DCD (Dual Coordinate Descend) method and SVMperf based on cutting plane.An efficient parallel SVM training algorithm is proposed to process large-scale nonlinear SVM optimization problem. It decomposed the whole problem to a series of subproblems which are processed independentlty at different nodes. When training a local SVM each node doesn't need to communicate with each other. Combination of solutions of subproblems needs to compute dot product of weight vectors(whose component cannot be computed directly) of different nodes, which needs high computional cost, so at the end of training each node using less reduced vectors to approximate the weight vector. This makes efficiently training large-scale nonlinear SVM possible.

Keywords/Search Tags:

linear SVM, large-scale datasets, minimum enclosing ball, parallel, reduction

PDF Full Text Request

Related items

1	A Study Of Support Vector Classifiers Based On The Theory Of Minimum Enclosing Ball
2	Study On Coupled Supported Vector Method And Its Application
3	Data Domain Description And Sample Reduction Research On Large-scale Datasets
4	The Research On RBF-ELM Two-phase Learning Algorithm
5	Research On Approximate Minimum Enclosing Balls In High Dimensions
6	The Study Of Several Key Issues On Large Data Sets Classification Techniques In Pattern Recognition
7	Research On Parallel Solving Of Large Scale Sparse Linear Systems
8	A Study On Kernel-based Classification And Dimension Reduction And Its Application
9	Application And Research On Clustering Algorithm In Large Scale Datasets
10	Geometric Approximation Algorithms: A Summary Based Approach