ThunderSVM: A Fast Parallel Support Vector Machine Library

Posted on: 2020-12-14  Degree: Master  Type: Thesis
Country: China  Candidate: J S Shi
GTID: 2428330590460691  Subject: Software engineering
Abstract/Summary:
Support Vector Machines (SVMs) are classic supervised learning models for classification, regression, and distribution estimation, and are widely used to solve real-world problems in data analysis. LIBSVM is the most widely used SVM library, and many machine learning frameworks use it as their backend implementation. However, SVM training and prediction are computationally very expensive for large and complex problems. LIBSVM was optimized for single-core CPUs in its early years and supports only limited parallel optimization. With the rapid growth of data, LIBSVM's training speed no longer meets the requirements of debugging and deploying common large problems. Therefore, many researchers have been working on accelerating SVMs using high-performance hardware such as Graphics Processing Units (GPUs). However, parallelizing SVMs still faces the following challenges: (1) SVM training repeatedly performs random accesses over the whole training set, resulting in high-latency accesses and repeated computation; (2) existing SVM training algorithms were proposed for single-core processors and do not consider multi-core or many-core processors; (3) multi-class SVMs need to train multiple binary SVMs, but training these binary SVMs in parallel usually requires more memory than modern main memory provides. To overcome these challenges, this thesis designs ThunderSVM, a fast, open-source SVM library that exploits GPUs and multi-core CPUs. It uses Sequential Minimal Optimization (SMO) with a working set buffer to reduce repeated random access and computation, and a kernel-value and support-vector sharing strategy to reduce memory consumption in multi-class SVM training and prediction. Experimental results show that ThunderSVM outperforms LIBSVM by two orders of magnitude and the GPU baseline by 5 times. ThunderSVM also produces the same SVM model as LIBSVM, with the same accuracy. ThunderSVM supports all functionalities of LIBSVM, including classification (SVC), regression (SVR), and one-class SVMs, and uses identical command-line options. ThunderSVM can be used through multiple language interfaces, including Python, MATLAB, and R. ThunderSVM is released on GitHub, the world's largest open-source code platform (https://github.com/Xtra-Computing/ThunderSVM). As of April 2019, ThunderSVM had attracted more than 900 stars and more than 130 forks, which suggests that it is popular among researchers.
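The support-vector sharing idea mentioned in the abstract can be illustrated with a small sketch. It is not ThunderSVM's code: it assumes a one-vs-one decomposition (one binary SVM per class pair, as LIBSVM uses), an RBF kernel, and made-up coefficients, and every function name below is hypothetical. It shows only that when a training instance is a support vector in several binary SVMs, its kernel value against a test point can be computed once and reused.

```python
import math
from itertools import combinations

# Toy data and hand-written "models" (assumptions for illustration only).
TRAIN = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
LABELS = [0, 1, 2, 1]
# Each binary model: ({training index: dual coefficient}, bias).
MODELS = {
    (0, 1): ({0: 1.0, 1: -1.0, 3: -0.5}, 0.1),
    (0, 2): ({0: 0.8, 2: -0.8}, 0.0),
    (1, 2): ({1: 0.6, 2: -0.9, 3: 0.3}, -0.2),
}


def rbf(x, y, gamma=0.5):
    """RBF kernel value exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def class_pairs(labels):
    """One-vs-one decomposition: one binary SVM per unordered class pair."""
    return list(combinations(sorted(set(labels)), 2))


def decide_naive(x, train, models, gamma=0.5):
    """Evaluate every binary SVM independently, recomputing kernel values."""
    evals = 0
    out = {}
    for pair, (coef, bias) in models.items():
        total = bias
        for i, c in coef.items():
            total += c * rbf(train[i], x, gamma)
            evals += 1
        out[pair] = total
    return out, evals


def decide_shared(x, train, models, gamma=0.5):
    """Compute one kernel value per unique support vector, then reuse it."""
    sv_ids = sorted({i for coef, _ in models.values() for i in coef})
    k = {i: rbf(train[i], x, gamma) for i in sv_ids}
    out = {
        pair: bias + sum(c * k[i] for i, c in coef.items())
        for pair, (coef, bias) in models.items()
    }
    return out, len(sv_ids)


if __name__ == "__main__":
    point = (0.2, 0.7)
    _, naive_evals = decide_naive(point, TRAIN, MODELS)
    _, shared_evals = decide_shared(point, TRAIN, MODELS)
    print(class_pairs(LABELS), naive_evals, shared_evals)
```

With this toy setup the three binary models reference 8 support-vector entries but only 4 distinct training instances, so the shared version evaluates the kernel 4 times per test point instead of 8 while producing the same decision values up to floating-point rounding.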
Keywords/Search Tags: Support Vector Machines, Graphics Processing Units, High Performance Computing, Machine Learning Systems