Parallel Support Vector Machine Based On The Cloud Platform

Posted on: 2015-03-20  Degree: Master  Type: Thesis
Country: China  Candidate: L P Li  Full Text: PDF
GTID: 2298330467463095  Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of computer technology, the growing popularity of Internet applications, and the rapid spread of social tools, enormous amounts of information and data are being produced. Traditional data analysis methods cannot cope with data at this scale, so distributed cloud platforms for large-scale data have become a new way to carry out large-scale data analysis. At present, the management and retrieval of large-scale data sets rely mainly on the classification and predictive analysis capabilities of data mining. Many classic classification algorithms have been applied and have achieved good results.

Building on an in-depth study of the support vector machine (SVM), this thesis combines the sub-block method with the cascade method and proposes a parallel SVM based on the MapReduce mechanism. In keeping with the way MapReduce executes jobs, the training data are processed in a manner similar to the cascade SVM. First, the parallel SVM method applies a simple pre-processing step to the data set so that each category is relatively uniformly distributed across the SVM training subsets, which avoids the extreme case in which SVM training yields no result. Second, parallel SVM training proceeds through iterative cascade training. To prevent the training algorithm from running indefinitely or overtraining, corresponding termination conditions are designed for the iterative parallel SVM training algorithm, and a scheme for controlling the overall parallel SVM training schedule is designed and implemented. Drawing on the computing and storage characteristics of Hadoop MapReduce and the features of HDFS, the thesis argues that the parallel SVM method is reasonable and correct, supported by a proof of validity and experimental verification. The resulting parallel SVM implementation is also applied to actual projects for SVM-based classification and prediction on large data sets.
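The abstract does not give the algorithm in detail, so the following is only a minimal single-machine sketch of the two steps it names: a pre-processing split that keeps each class roughly evenly represented in every block, and an iterative cascade with termination conditions. All function names here are hypothetical. The stopping rule (support vector set unchanged between rounds, or a maximum number of rounds) is an assumption, and `toy_train` is a stand-in for a real SVM solver that works only on separable one-dimensional data. In the thesis setting, the per-block training would presumably run as map tasks and the merging as reduce tasks on Hadoop.

```python
from collections import defaultdict


def stratified_blocks(samples, n_blocks):
    """Deal (features, label) pairs round-robin, class by class, so that each
    block receives a near-equal share of every class."""
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)
    blocks = [[] for _ in range(n_blocks)]
    i = 0
    for group in by_label.values():
        for sample in group:
            blocks[i % n_blocks].append(sample)
            i += 1
    return blocks


def dedupe(samples):
    """Remove duplicate samples while preserving order (samples must be hashable)."""
    return list(dict.fromkeys(samples))


def cascade_train(samples, n_blocks, train_fn, max_rounds=10):
    """Iterative cascade. train_fn(samples) must return the support vectors,
    a subset of its input. Returns (support_vectors, rounds_used).
    Assumes samples is non-empty and n_blocks >= 1."""
    feedback, previous = [], None
    for round_no in range(1, max_rounds + 1):
        blocks = stratified_blocks(samples, n_blocks)
        # "Map" step: train each block, plus last round's support vectors.
        layer = [train_fn(dedupe(b + feedback)) for b in blocks if b]
        # "Reduce" step: merge results pairwise and retrain until one set is left.
        while len(layer) > 1:
            merged = []
            for j in range(0, len(layer), 2):
                pair = layer[j] + (layer[j + 1] if j + 1 < len(layer) else [])
                merged.append(train_fn(dedupe(pair)))
            layer = merged
        current = set(layer[0])
        if current == previous:  # assumed stopping rule: no change since last round
            return layer[0], round_no
        previous, feedback = current, layer[0]
    return layer[0], max_rounds  # assumed safeguard against endless iteration


def toy_train(block):
    """Stand-in for an SVM solver on separable 1-D data with labels -1 and +1:
    keeps the largest negative and the smallest positive sample."""
    neg = [s for s in block if s[1] == -1]
    pos = [s for s in block if s[1] == 1]
    result = []
    if neg:
        result.append(max(neg, key=lambda s: s[0][0]))
    if pos:
        result.append(min(pos, key=lambda s: s[0][0]))
    return result
```

Calling `cascade_train(data, 4, toy_train)` on a list of `((x,), label)` tuples returns the surviving support vectors and the number of rounds used. Replacing `toy_train` with a wrapper around a real SVM library leaves the control flow unchanged.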
Keywords/Search Tags: parallel, SVM, Hadoop, MapReduce