A Study On Large Scale Nonlinear Support Vector Machines

Posted on: 2019-04-08    Degree: Master    Type: Thesis
Country: China    Candidate: Y. D. Li    Full Text: PDF
GTID: 2428330593951074    Subject: Software engineering
Abstract/Summary:
Support vector machine (SVM) is a supervised machine learning algorithm grounded in statistical learning theory. Owing to its excellent generalization performance, SVM has become a focus of machine learning research. It has unique advantages in small-sample, nonlinear, and high-dimensional pattern recognition problems, and it effectively mitigates overfitting, so it is widely used in pattern recognition, regression analysis, function estimation, time series forecasting, and other fields. However, as data volumes grow in the big data era, how to train large-scale nonlinear SVM models efficiently while avoiding the curse of dimensionality has become a central research question. This thesis focuses on improving algorithms for large-scale nonlinear SVMs and on their application.

First, this thesis reviews the development of the basic theories and algorithms of SVM, presents the research background and significance, and analyzes the problems encountered as the algorithms have evolved. Second, it describes two theoretical SVM algorithms and introduces the roles of the stochastic gradient descent (SGD) algorithm and of kernel functions in nonlinear SVM. Finally, we propose a new, improved algorithm for large-scale nonlinear SVMs, developed separately for support vector classification (SVC) and support vector regression (SVR).

For large-scale nonlinear SVC, we propose an efficient support-vector reduction strategy (SRS) based on kernel similarity. In each SGD iteration of the nonlinear SVC model, when a sample is admitted as a support vector, the kernel function computes its kernel similarities to the existing set of support vectors; if a similarity exceeds a predefined threshold, the SRS merges the redundant support vector to improve the efficiency of the model. In addition, we further accelerate training by combining the SRS with other budget maintenance strategies. For large-scale nonlinear SVR, we extend both the budgeted SGD and the SRS that were applied to large-scale nonlinear SVC. Experimental results on SVC and SVR models show that the proposed kernel-similarity SRS and the budgeted SGD strategy significantly reduce the training time of the model while achieving competitive accuracy, and to a certain extent resolve the curse of kernelization in training large-scale nonlinear SVMs.
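The mechanism described above (kernelized SGD whose support-vector set is kept small by merging new support vectors that are highly kernel-similar to existing ones) can be sketched as follows. This is only an illustrative reconstruction, not the thesis's actual implementation: the function names, the Pegasos-style step size, the RBF kernel, and the `sim_threshold` parameter are all assumptions introduced for the sketch.

```python
import numpy as np

def rbf(x, y, gamma=0.5):
    """RBF kernel; its value in (0, 1] doubles as a similarity measure."""
    d = x - y
    return np.exp(-gamma * np.dot(d, d))

def kernel_sgd_svc(X, y, lam=0.01, epochs=5, sim_threshold=0.95, gamma=0.5, seed=0):
    """Kernelized SGD for SVC with a kernel-similarity reduction step.

    Hypothetical sketch of the abstract's idea: on a margin violation,
    if the new point's kernel similarity to an existing support vector
    exceeds `sim_threshold`, merge its coefficient into that vector
    instead of growing the support-vector set.
    """
    rng = np.random.default_rng(seed)
    sv, alpha = [], []                 # support vectors and coefficients
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)      # Pegasos-style decaying step size
            f = sum(a * rbf(v, X[i], gamma) for v, a in zip(sv, alpha))
            alpha = [(1.0 - eta * lam) * a for a in alpha]  # L2 shrinkage
            if y[i] * f < 1:           # hinge-loss margin violation
                sims = [rbf(v, X[i], gamma) for v in sv]
                j = int(np.argmax(sims)) if sims else -1
                if j >= 0 and sims[j] > sim_threshold:
                    alpha[j] += eta * y[i]          # merge into nearby SV
                else:
                    sv.append(X[i])                 # admit a new SV
                    alpha.append(eta * y[i])

    def predict(x):
        return np.sign(sum(a * rbf(v, x, gamma) for v, a in zip(sv, alpha)))

    return predict, len(sv)

# Usage on two well-separated Gaussian blobs:
rng = np.random.default_rng(1)
Xa = rng.normal([0.0, 0.0], 0.1, size=(20, 2))
Xb = rng.normal([3.0, 3.0], 0.1, size=(20, 2))
X = np.vstack([Xa, Xb])
y = np.array([-1] * 20 + [1] * 20)
predict, n_sv = kernel_sgd_svc(X, y)
```

Because each tightly clustered blob yields near-duplicate kernel columns, the merge step keeps the support-vector budget far below the sample count, which is exactly the per-iteration cost saving the abstract claims.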
Keywords/Search Tags: Support vector classification, Support vector regression, Stochastic gradient descent, Kernel similarity, Budget maintenance strategy