Research On Parallel SVM Computing Based On Heterogeneous Cluster

Posted on: 2019-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: R H Zhang
GTID: 2348330569995460
Subject: Engineering

Abstract/Summary:
SVM is valued for its complete theoretical foundation and its good performance in practical applications, notably its strength on multi-class problems and its successful use of kernel functions, and it is widely applied in face recognition, time-series prediction, adaptive signal processing, pattern recognition, and so on. The essence of the SVM training phase is solving a quadratic programming (QP) problem to obtain the support vectors, which requires computing an N×N kernel matrix, where N is the number of training samples. In the era of big data, with many sample classes, massive numbers of samples, and high-dimensional features, SVM therefore runs into problems of both memory consumption and training time. To address these problems, this dissertation focuses on the parallelization of SVM and studies parallel techniques that decentralize SVM's memory dependence and accelerate the training process.

By default, the LIBSVM training process fixes the kernel parameter gamma and the penalty parameter C, which is not conducive to improving the accuracy and generalization ability of the trained model. Many scholars at home and abroad use hybrid programming (such as MPI+CUDA) to accelerate two-class SVM training, but real applications must handle many classes of samples, and hybrid programming places high demands on the software developers. The main work of the thesis is as follows:

1. A parameter-optimization function is added on top of LIBSVM to improve the accuracy and generalization ability of the trained model.

2. After surveying the literature on parallelized SVM and analyzing the advantages and disadvantages of existing big-data programming models, this thesis proposes a distributed heterogeneous-cluster parallelization scheme. The heterogeneous cluster has good scalability: the number of different computing resources (such as CPU, GPU, and DSP nodes) can be increased or decreased with few changes to the program code; it is easy to extend for secondary development and does not require programmers to master parallel-programming technologies; it adapts to multiple application scenarios; and its computational performance grows nearly linearly with the number of computing machines.

3. To speed up SVM training, the SVM model was studied together with its serial implementation, four parallel implementations of SVM were compared, and the svm_train_one parallel acceleration scheme was finally selected. Clarifying the code structure and locating the hot spots in the training process facilitated parallelization; CPU and GPU versions of the svm_train_one parallel acceleration program were written and registered with the cluster framework.

4. An experimental environment was built on Alibaba Cloud, and two sets of samples were used for performance testing. The experimental results show that the svm_train_one parallel program based on heterogeneous clusters significantly improves SVM training speed, which has important practical value.
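The parameter optimization described in item 1 is commonly an exhaustive grid search over (C, gamma) scored by cross-validation accuracy. A minimal sketch of that selection loop, with a stand-in scoring function (the thesis would instead call the LIBSVM trainer with k-fold cross-validation; the toy surface and its peak at C=10, gamma=0.5 are illustrative assumptions only):

```python
from itertools import product

def cv_score(C, gamma):
    # Stand-in for k-fold cross-validation accuracy; in the thesis this
    # would train and validate an RBF-kernel SVM via LIBSVM for (C, gamma).
    # This toy scoring surface simply peaks at C=10, gamma=0.5.
    return -((C - 10) ** 2 + (gamma - 0.5) ** 2)

def grid_search(Cs, gammas):
    # Exhaustive search over the (C, gamma) grid: score every candidate
    # pair and keep the best-scoring one.
    return max(product(Cs, gammas), key=lambda p: cv_score(*p))

best = grid_search([1, 10, 100], [0.1, 0.5, 1.0])
print(best)  # (10, 0.5) on the toy surface
```

Each grid point is independent of the others, which is also what allows the search itself to be distributed across cluster nodes.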
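The parallelism exploited in item 3 comes from LIBSVM's one-vs-one multi-class decomposition: k classes yield k(k-1)/2 independent two-class subproblems, each solved by a separate svm_train_one call, so those calls can be dispatched to workers concurrently. A structural sketch of the task decomposition (the `train_binary` solver is a placeholder, not LIBSVM's actual QP solver, and a thread pool stands in here for the cluster's CPU/GPU worker nodes):

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

def train_binary(pair):
    # Stand-in for svm_train_one: solve the two-class QP for classes
    # (ci, cj) and return the resulting sub-model.
    ci, cj = pair
    return (ci, cj, f"model_{ci}_{cj}")  # placeholder for a trained model

def train_one_vs_one(classes, workers=4):
    # One-vs-one decomposition: k classes -> k*(k-1)/2 independent
    # binary subproblems, dispatched to the worker pool in parallel.
    pairs = list(combinations(classes, 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(train_binary, pairs))

models = train_one_vs_one(range(4))
print(len(models))  # 4 classes -> 6 binary sub-models
```

Because the subproblems share no state, the same decomposition maps naturally onto heterogeneous nodes, with each node running the CPU or GPU variant of the svm_train_one program registered with the cluster framework.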
Keywords/Search Tags:support vector machine, cluster, heterogeneous computing, parallelization