Methodologies And Applications For Solving Large-scale Support Vector Machines

Posted on: 2022-11-23; Degree: Master; Type: Thesis
Country: China; Candidate: B S Li; Full Text: PDF
GTID: 2518306752982589; Subject: Applied Statistics
Abstract/Summary:
Traditional methods for solving Support Vector Machines (SVMs) suffer from low storage and computational efficiency for the kernel matrix as the volume of data keeps growing. This thesis proposes methods that overcome memory overflow and solve large-scale SVM models efficiently. The specific research work is as follows:

(1) Stochastic Gradient Descent (SGD) is proposed for solving large-scale SVMs. First, based on the convexity of the unconstrained SVM problem, an SGD algorithm is designed for solving large-scale SVMs. Second, the convergence properties of SGD are verified for both smooth and non-smooth functions. Finally, the designed algorithm is applied to microarray classification, and the experiments show that it improves both the accuracy and the convergence efficiency of microarray classification compared with the traditional classical algorithm.

(2) Shared Memory Parallel SGD (SMP-SGD) is proposed for solving large-scale SVMs. First, various parallel computing models are investigated and a shared memory model suitable for parallelizing SGD is built. Second, the SMP-SGD algorithm is designed using the Message Passing Interface (MPI), and a parallel version of the Adaptive Stochastic Gradient Descent (ASGDs) algorithm for solving large-scale SVMs is designed on top of it. Finally, the two algorithms are applied to large-scale brain tumor image detection, and the experiments show that, compared with SGD, SMP-SGD and SMP-ASGDs improve computational efficiency while preserving the accuracy of brain tumor image detection.

(3) Bulk Synchronous Parallel SGD (BSP-SGD), based on the Spark parallel framework, is proposed for solving large-scale SVMs. First, Spark's in-memory computing and Yarn resource scheduling are investigated, and the BSP computing model is analyzed. Second, the Spark-based BSP-SGD algorithm is designed for solving large-scale SVMs. Finally, the algorithm is applied to large-scale text data classification, and the experiments show that it overcomes the memory overflow problem, improves the computational efficiency of the model, and maintains high accuracy.
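
The abstract does not give the thesis's update rule, so the following is only a minimal sketch of how SGD can be applied to the unconstrained (primal) linear SVM with hinge loss, the non-smooth case. The objective, the Pegasos-style step size 1/(lam*t), the omission of a bias term, and all names (`sgd_linear_svm`, `lam`, `epochs`) are assumptions made for illustration, not the author's algorithm. Because each step touches a single sample, no kernel matrix is ever stored.

```python
import numpy as np


def sgd_linear_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Approximately minimise lam/2 * ||w||^2 + mean(max(0, 1 - y * (X @ w))).

    X: (n, d) feature matrix; y: labels in {-1, +1}.
    Illustrative sketch: no bias term, Pegasos-style step size (assumed).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        # Visit the samples in a fresh random order each epoch.
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)          # decaying step size
            margin = y[i] * (X[i] @ w)     # margin under the current w
            w *= 1.0 - eta * lam           # step on the regulariser term
            if margin < 1.0:
                # Hinge loss is active: take a subgradient step on it.
                w += eta * y[i] * X[i]
    return w


if __name__ == "__main__":
    # Synthetic two-class data, used only to exercise the function.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(2.0, 1.0, (100, 2)),
                   rng.normal(-2.0, 1.0, (100, 2))])
    y = np.concatenate([np.ones(100), -np.ones(100)])
    w = sgd_linear_svm(X, y)
    print("training accuracy:", np.mean(np.sign(X @ w) == y))
```

Memory use here is O(d) for the weight vector plus whatever is needed to stream the samples, which is the property that makes this family of methods attractive at large scale. The parallel variants described in (2) and (3) would distribute these per-sample updates across workers, but their synchronization details are not stated in the abstract and are not sketched here.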
Keywords/Search Tags: Large-scale support vector machines, Stochastic gradient descent, Parallel computing, Shared memory, Bulk synchronous parallel