Research On Parallel Computing Of Support Vector Machines Based On Improved Stochastic Gradient Descent And Its Application

Posted on:2024-01-21

Degree:Master

Type:Thesis

Country:China

Candidate:Q H Qiu

Full Text:PDF

GTID:2558307073476574

Subject:Applied statistics

Abstract/Summary:

Support Vector Machines (SVMs) require substantial training time and memory on large-scale datasets. Solving them iteratively with stochastic gradient descent, and running that solver in a parallel environment, can reduce both. This thesis takes this as its starting point, and its main work falls into two parts.

(1) An Improved Weighted Linear Stochastic Gradient Descent (IWLSGD) method is proposed for solving the SVM from the primal problem (IWLSGD-SVM). By introducing a penalty function, the constrained primal problem of the SVM is converted into an unconstrained one. The unconstrained form is a convex optimization problem and can be solved iteratively with an optimization algorithm. The traditional way of solving the SVM, however, requires a large amount of matrix computation, which is time-consuming. To improve computational speed, this thesis designs a Linear Stochastic Gradient Descent (LSGD) algorithm for solving the SVM (LSGD-SVM). Because most data in practical problems are unbalanced, the LSGD-SVM algorithm is then improved by using a weighting approach to correct the classification hyperplane, which improves classification accuracy. The weights take into account the relative number of samples in the two classes, which to some extent avoids extreme and one-sided results. The algorithms are applied to test data. The experiments show that solving the SVM with stochastic gradient descent is faster than the traditional solution method, and that the proposed IWLSGD-SVM outperforms LSGD-SVM in both classification accuracy and running time.

(2) A parallel computing model based on the Spark framework is proposed. First, a distributed platform is built on a single computer, configured with cluster files, and connected to external compilers. Then, to improve computational efficiency, a batch synchronous parallel IWLSGD-SVM model is designed. The model adopts data parallelism and uses synchronous communication. Finally, two groups of wind turbine data of different scales are used to verify the effectiveness of the batch synchronous parallel algorithm. The experiments show that, on large-scale datasets, this method achieves higher classification accuracy than single-machine mode and takes less time, and therefore solves the problem more efficiently.
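The two ideas in the abstract can be illustrated with a minimal sketch. It is not the thesis's IWLSGD-SVM: the abstract does not give the weighting formula, the step-size schedule, or the Spark code. So the inverse-frequency weights in `class_weights`, the decaying `step_size`, and every function name here are assumptions chosen for illustration. The sketch also assumes labels in {+1, -1} with both classes present. Workers are simulated in a plain Python loop rather than on Spark.

```python
import random


def class_weights(y):
    """Inverse-frequency weights (assumed scheme): n / (2 * class count)."""
    n = len(y)
    n_pos = sum(1 for label in y if label == 1)
    return {1: n / (2.0 * n_pos), -1: n / (2.0 * (n - n_pos))}


def subgradient(w, b, x, label, c, lam):
    """Subgradient of lam/2*||w||^2 + c*max(0, 1 - label*(w.x + b))."""
    margin = label * (sum(wj * xj for wj, xj in zip(w, x)) + b)
    gw = [lam * wj for wj in w]
    gb = 0.0
    if margin < 1:  # hinge loss is active for this sample
        gw = [g - c * label * xj for g, xj in zip(gw, x)]
        gb = -c * label
    return gw, gb


def step_size(eta0, lam, t):
    """Decaying step size (an assumed schedule, not taken from the thesis)."""
    return eta0 / (1.0 + eta0 * lam * t)


def train_serial(X, y, lam=0.01, eta0=0.1, epochs=20, seed=0):
    """Class-weighted SGD on the unconstrained primal, one sample per step."""
    rng = random.Random(seed)
    cw = class_weights(y)
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = step_size(eta0, lam, t)
            gw, gb = subgradient(w, b, X[i], y[i], cw[y[i]], lam)
            w = [wj - eta * g for wj, g in zip(w, gw)]
            b -= eta * gb
    return w, b


def train_sync_parallel(X, y, n_workers=2, batch_size=2, lam=0.01,
                        eta0=0.1, rounds=100, seed=0):
    """Synchronous data-parallel variant with simulated workers."""
    rng = random.Random(seed)
    cw = class_weights(y)  # global class counts, computed once
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    # Data parallelism: each worker owns a fixed slice of the sample indices.
    parts = [list(range(k, len(X), n_workers)) for k in range(n_workers)]
    for t in range(1, rounds + 1):
        sum_gw, sum_gb = [0.0] * dim, 0.0
        for part in parts:  # each pass stands in for one worker
            batch = rng.sample(part, min(batch_size, len(part)))
            for i in batch:
                gw, gb = subgradient(w, b, X[i], y[i], cw[y[i]], lam)
                sum_gw = [s + g / len(batch) for s, g in zip(sum_gw, gw)]
                sum_gb += gb / len(batch)
        # Synchronization barrier: average all workers, then update once.
        eta = step_size(eta0, lam, t)
        w = [wj - eta * s / n_workers for wj, s in zip(w, sum_gw)]
        b -= eta * sum_gb / n_workers
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the side of the hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1
```

Within one round every simulated worker reads the same `w` and `b`, and the model changes only after all workers have reported. That is the synchronous property the abstract describes. On a real cluster the loop over `parts` would become per-partition work followed by an aggregation step.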

Keywords/Search Tags:

Stochastic Gradient, Weight, Support Vector Machines, Parallel computing, Unbalanced data

Related items

1	Imbalanced Stochastic Gradient Descent Online Algorithm For SVM
2	Methodologies And Applications For Solving Large-scale Support Vector Machines
3	A Study On Large Scale Nonlinear Support Vector Machines
4	Research On Some Problems And Applications In Support Vector Machines
5	Studies Of Several Mathematical Models And Algorithms Of Support Vector Machine
6	Research On Traditional Classification Model Based On Unbalanced Data
7	Research On Stochastic Coordinate Algorithm Of Support Vector Machines And Robust Support Vector Machines Under The Background Of Big Data
8	Research On Parallel Text Classification Method Based On Support Vector Machine
9	The Research And Optimization On Support Vector Machines Algorithm
10	Classification Methods Based On Support Vector Machines And Manifold Learning