Research On Parallel SVM Computing Based On Heterogeneous Cluster

Posted on: 2019-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: R H Zhang
GTID: 2348330569995460
Subject: Engineering

Abstract/Summary:
SVM is valued for its complete theoretical foundation and its good performance in practical applications, notably its strength on multi-class problems and its successful use of kernel functions, and it is widely applied in face recognition, time-series prediction, adaptive signal processing, pattern recognition, and so on. The essence of the SVM training phase is solving a quadratic programming (QP) problem to obtain the support vectors, which requires computing an N×N kernel matrix, where N is the number of training samples. In the era of big data, with many sample classes, massive numbers of samples, and high-dimensional features, SVM therefore runs into problems of both memory consumption and training time. To address these problems, this dissertation focuses on the parallelization of SVM and studies parallel techniques that decentralize SVM's memory dependence and accelerate the training process.

By default, the LIBSVM training process fixes the kernel parameter gamma and the penalty parameter C, which is not conducive to improving the accuracy and generalization ability of the trained model. Many scholars at home and abroad use hybrid programming (such as MPI+CUDA) to accelerate two-class SVM training, but real applications must handle many classes of samples, and hybrid programming places high demands on the software developers. The main work of the thesis is as follows:

1. A parameter-optimization function is added on top of LIBSVM to improve the accuracy and generalization ability of the trained model.

2. After surveying the literature on parallelized SVM and analyzing the advantages and disadvantages of existing big-data programming models, this thesis proposes a distributed heterogeneous-cluster parallelization scheme. The heterogeneous cluster has good scalability: the number of different computing resources (such as CPU, GPU, and DSP nodes) can be increased or decreased with few changes to the program code; it is easy to extend for secondary development and does not require programmers to master parallel-programming technologies; it adapts to multiple application scenarios; and its computational performance grows nearly linearly with the number of computing machines.

3. To speed up SVM training, the SVM model was studied together with its serial implementation, four parallel implementations of SVM were compared, and the svm_train_one parallel acceleration scheme was finally selected. Clarifying the code structure and locating the hot spots in the training process facilitated parallelization; CPU and GPU versions of the svm_train_one parallel acceleration program were written and registered with the cluster framework.

4. An experimental environment was built on Alibaba Cloud, and two sets of samples were used for performance testing. The experimental results show that the svm_train_one parallel program based on heterogeneous clusters significantly improves SVM training speed, which has important practical value.
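The parameter optimization described in item 1 is commonly an exhaustive grid search over (C, gamma) scored by cross-validation accuracy. A minimal sketch of that selection loop, with a stand-in scoring function (the thesis would instead call the LIBSVM trainer with k-fold cross-validation; the toy surface and its peak at C=10, gamma=0.5 are illustrative assumptions only):

```python
from itertools import product

def cv_score(C, gamma):
    # Stand-in for k-fold cross-validation accuracy; in the thesis this
    # would train and validate an RBF-kernel SVM via LIBSVM for (C, gamma).
    # This toy scoring surface simply peaks at C=10, gamma=0.5.
    return -((C - 10) ** 2 + (gamma - 0.5) ** 2)

def grid_search(Cs, gammas):
    # Exhaustive search over the (C, gamma) grid: score every candidate
    # pair and keep the best-scoring one.
    return max(product(Cs, gammas), key=lambda p: cv_score(*p))

best = grid_search([1, 10, 100], [0.1, 0.5, 1.0])
print(best)  # (10, 0.5) on the toy surface
```

Each grid point is independent of the others, which is also what allows the search itself to be distributed across cluster nodes.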
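The parallelism exploited in item 3 comes from LIBSVM's one-vs-one multi-class decomposition: k classes yield k(k-1)/2 independent two-class subproblems, each solved by a separate svm_train_one call, so those calls can be dispatched to workers concurrently. A structural sketch of the task decomposition (the `train_binary` solver is a placeholder, not LIBSVM's actual QP solver, and a thread pool stands in here for the cluster's CPU/GPU worker nodes):

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

def train_binary(pair):
    # Stand-in for svm_train_one: solve the two-class QP for classes
    # (ci, cj) and return the resulting sub-model.
    ci, cj = pair
    return (ci, cj, f"model_{ci}_{cj}")  # placeholder for a trained model

def train_one_vs_one(classes, workers=4):
    # One-vs-one decomposition: k classes -> k*(k-1)/2 independent
    # binary subproblems, dispatched to the worker pool in parallel.
    pairs = list(combinations(classes, 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(train_binary, pairs))

models = train_one_vs_one(range(4))
print(len(models))  # 4 classes -> 6 binary sub-models
```

Because the subproblems share no state, the same decomposition maps naturally onto heterogeneous nodes, with each node running the CPU or GPU variant of the svm_train_one program registered with the cluster framework.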
Keywords/Search Tags:support vector machine, cluster, heterogeneous computing, parallelization