Methodologies And Applications For Solving Large-scale Support Vector Machines

Posted on:2022-11-23

Degree:Master

Type:Thesis

Country:China

Candidate:B S Li

Full Text:PDF

GTID:2518306752982589

Subject:Applied Statistics

Abstract/Summary:

Traditional methods for solving Support Vector Machines (SVMs) store and compute the kernel matrix inefficiently, and this becomes a bottleneck as the volume of data grows. This thesis proposes methods that overcome storage overflow and solve large-scale SVM models efficiently. The specific research work is as follows:

(1) Stochastic Gradient Descent (SGD) is proposed for solving large-scale SVMs. First, based on the convexity of the unconstrained SVM problem, an SGD algorithm is designed to solve large-scale SVMs. Second, the convergence properties of SGD are verified for two cases, smooth and non-smooth functions. Finally, the algorithm is applied to microarray classification, and the experiments show that it improves both the accuracy and the convergence efficiency of microarray classification compared with the traditional classical algorithm.

(2) Shared Memory Parallel SGD (SMP-SGD) is proposed for solving large-scale SVMs. First, various parallel computing models are investigated and a shared memory model suitable for parallelizing SGD is built. Second, the SMP-SGD algorithm is designed using the Message Passing Interface (MPI), and a parallel version of the Adaptive Stochastic Gradient Descent (ASGDs) algorithm for solving large-scale SVMs is built on top of it. Finally, the two algorithms are applied to large-scale brain tumor image detection, and the experiments show that, compared with SGD, SMP-SGD and SMP-ASGDs improve computational efficiency while preserving the accuracy of brain tumor image detection.

(3) Bulk Synchronous Parallel SGD (BSP-SGD), based on the Spark parallel framework, is proposed for solving large-scale SVMs. First, Spark's in-memory computing and YARN resource scheduling are investigated, and the BSP computing model is analyzed. Second, the Spark-based BSP-SGD algorithm is designed to solve large-scale SVMs. Finally, the algorithm is applied to large-scale text classification, and the experiments show that it overcomes the memory overflow problem, improves the computational efficiency of the model, and maintains high accuracy.
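As a rough illustration of items (1) and (3) of the abstract, the sketch below runs SGD on the unconstrained linear SVM objective and then mimics bulk synchronous supersteps in a single process. It is not the thesis's algorithm. The abstract does not give the update rule, step size, or synchronization scheme, so the hinge loss with L2 regularization, the Pegasos-style step size 1/(lam * t), the model averaging at each barrier, and every name here (`sgd_svm`, `bsp_sgd_svm`, `make_toy_data`, `accuracy`) are assumptions made for illustration. No Spark or MPI calls are used.

```python
import random


def sgd_svm(data, lam=0.1, epochs=20, seed=0, w=None, t0=0):
    """Pegasos-style SGD on lam/2 * ||w||^2 + mean(max(0, 1 - y * <w, x>)).

    data is a list of (features, label) pairs with label in {-1, +1}.
    Returns the weights and the step counter so a caller can resume.
    """
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0]) if w is None else list(w)
    t = t0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)  # one random pass over the data
        for i in order:
            x, y = data[i]
            t += 1
            eta = 1.0 / (lam * t)  # assumed decaying step size
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Subgradient step: shrink for the regularizer ...
            w = [(1.0 - eta * lam) * wj for wj in w]
            # ... and move toward y * x only when the hinge loss is active.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w, t


def bsp_sgd_svm(data, workers=4, supersteps=20, lam=0.1, seed=0):
    """Single-process simulation of bulk synchronous parallel SGD.

    In each superstep every simulated worker runs one local SGD pass over
    its own partition, starting from the shared weights; the barrier is
    modeled by averaging the local weights (an assumed scheme).
    """
    parts = [data[k::workers] for k in range(workers)]
    dim = len(data[0][0])
    w, t = [0.0] * dim, 0
    for s in range(supersteps):
        local = [
            sgd_svm(p, lam=lam, epochs=1, seed=seed + s * workers + k, w=w, t0=t)
            for k, p in enumerate(parts)
        ]
        # Barrier: combine the local models into the next shared model.
        w = [sum(lw[j] for lw, _ in local) / workers for j in range(dim)]
        t = max(lt for _, lt in local)
    return w


def make_toy_data(n=200, gap=1.0, seed=1):
    """Hypothetical separable 2-D data with a constant bias feature."""
    rng = random.Random(seed)
    data = []
    while len(data) < n:
        a, b = rng.uniform(-2, 2), rng.uniform(-2, 2)
        if abs(a - b) >= gap:
            data.append(([a, b, 1.0], 1 if a > b else -1))
    return data


def accuracy(w, data):
    """Fraction of points whose predicted sign matches the label."""
    hits = sum(1 for x, y in data if y * sum(wj * xj for wj, xj in zip(w, x)) > 0)
    return hits / len(data)


if __name__ == "__main__":
    toy = make_toy_data()
    w_serial, _ = sgd_svm(toy)
    w_bsp = bsp_sgd_svm(toy)
    print("serial SGD accuracy:", accuracy(w_serial, toy))
    print("simulated BSP-SGD accuracy:", accuracy(w_bsp, toy))
```

Neither routine forms a kernel matrix: each step touches one sample and a weight vector of fixed size, which is the storage advantage the abstract attributes to SGD. In a real Spark job the list of partitions would correspond to distributed data partitions and the averaging step to a reduce across executors, but that mapping is only suggested here, not taken from the thesis.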

Keywords/Search Tags:

Large-scale support vector machines, Stochastic gradient descent, Parallel computing, Shared memory, Bulk synchronous parallel

Related items

1	A Study On Large Scale Nonlinear Support Vector Machines
2	Parallel Stochastic Gradient Descent Algorithm On Large-scale High-dimensional And Sparse Data
3	Imbalanced Stochastic Gradient Descent Online Algorithm For SVM
4	The Research Of Distributed Parallel Support Vector Regression Machine Algorithm And Framework
5	A Study On The Fast Training Methods Of Support Vector Machines Based On Coordinate Descent
6	Research Of Stochastic Parallel Gradient Descent Based On Segmentation Random Disturbance
7	A Research Of Stochastic Gradient Descent Algorithm
8	Research On Triangle Computation Of Large-scale Graphs
9	Parallel support vector machines for multi-category classification of large scale data
10	Research On Stochastic Coordinate Algorithm Of Support Vector Machines And Robust Support Vector Machines Under The Background Of Big Data