
Generalization Performance Of Support Vector Machine Distributed Ensemble Learning Based On Markov Sampling

Posted on: 2022-04-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H W Jiang
GTID: 1480306536486544
Subject: Basic mathematics
Abstract/Summary:
Divide-and-conquer is a basic strategy for big data processing, and it has become increasingly important with the rapid development of distributed computing in recent years. However, redundant or noisy samples in big data not only waste storage space but also reduce the computational efficiency and accuracy of machine learning algorithms. There is therefore an urgent need for selective sampling or resampling in the big data environment. Taking the support vector machine (SVM) as the starting point, this dissertation systematically studies SVM distributed ensemble learning based on Markov sampling. The specific research work is summarized as follows:

1. The generalization performance of SVM ensemble learning based on uniformly ergodic Markov chain (u.e.M.c.) samples is studied, and the optimal learning rate is established. Building on this theoretical analysis, two SVM ensemble learning algorithms based on Markov resampling are proposed. Numerical studies on public data sets show that, compared with the classical ensemble algorithm, the two proposed algorithms have smaller misclassification rates and a shorter total sampling and training time.

2. The generalization bound of SVM distributed learning based on u.e.M.c. samples is established, and the optimal convergence rate is obtained. A new SVM distributed learning algorithm based on Markov sampling is proposed. Numerical studies on public data sets show that, compared with the classical SVM distributed algorithm, the proposed algorithm has higher accuracy and a shorter total sampling and training time.

3. Since tuning the regularization hyperparameter of SVM is usually time-consuming in a big data environment, an SVM algorithm that does not require tuning the regularization hyperparameter is proposed. Numerical studies on public data sets show that, compared with the classical SVM algorithm with regularization-hyperparameter tuning, the proposed algorithm has higher accuracy and a shorter total sampling and training time. As an application, the generalization performance of SVM distributed learning without regularization-hyperparameter tuning is also studied.
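The abstract does not give the sampling rule, the kernel, or the way local models are combined, so the following is only a minimal sketch of how divide-and-conquer SVM learning with Markov sampling might look. It assumes a linear SVM trained by a Pegasos-style subgradient method, a Metropolis-style acceptance rule driven by the hinge loss of a preliminary model, a forced acceptance after `max_rejects` consecutive rejections, and simple averaging of the local weight vectors. All function names and parameter values are hypothetical and are not taken from the dissertation.

```python
import math
import random

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def hinge_loss(w, x, y):
    """Hinge loss of the linear classifier sign(w.x) on one labeled point."""
    return max(0.0, 1.0 - y * dot(w, x))

def train_linear_svm(samples, lam=0.01, epochs=20, seed=0):
    """Pegasos-style subgradient descent (assumed solver, no bias term)."""
    rng = random.Random(seed)
    w, t = [0.0] * len(samples[0][0]), 0
    for _ in range(epochs):
        order = list(samples)
        rng.shuffle(order)
        for x, y in order:
            t += 1
            eta = 1.0 / (lam * t)
            violated = hinge_loss(w, x, y) > 0.0
            w = [(1.0 - eta * lam) * wi for wi in w]   # shrink (regularizer)
            if violated:                               # margin violated
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def markov_sample(pool, w0, n, rng, max_rejects=10):
    """Draw n points as a Markov chain (assumed Metropolis-style rule).

    A candidate is accepted with probability min(1, exp(loss(current) -
    loss(candidate))), so low-loss candidates under the preliminary model
    w0 are accepted more readily. After max_rejects consecutive
    rejections the candidate is accepted anyway, so the loop terminates.
    """
    current = rng.choice(pool)
    chain, rejects = [current], 0
    while len(chain) < n:
        cand = rng.choice(pool)
        gap = hinge_loss(w0, *current) - hinge_loss(w0, *cand)
        accept_prob = math.exp(min(0.0, gap))
        if rng.random() < accept_prob or rejects >= max_rejects:
            current, rejects = cand, 0
            chain.append(current)
        else:
            rejects += 1
    return chain

def distributed_markov_svm(data, n_blocks=3, n_prelim=30, n_chain=100, seed=0):
    """Split the data, train one SVM per block on a Markov chain, average."""
    rng = random.Random(seed)
    data = list(data)
    rng.shuffle(data)
    blocks = [data[i::n_blocks] for i in range(n_blocks)]
    models = []
    for block in blocks:
        w0 = train_linear_svm(rng.sample(block, n_prelim))  # preliminary model
        chain = markov_sample(block, w0, n_chain, rng)
        models.append(train_linear_svm(chain))
    dim = len(models[0])
    return [sum(w[j] for w in models) / n_blocks for j in range(dim)]

def make_data(n, rng):
    """Synthetic two-class Gaussian data centered at (2, 2) and (-2, -2)."""
    data = []
    for _ in range(n):
        y = rng.choice([-1, 1])
        data.append(((rng.gauss(2.0 * y, 1.0), rng.gauss(2.0 * y, 1.0)), y))
    return data

def accuracy(w, data):
    return sum(1 for x, y in data if y * dot(w, x) > 0) / len(data)

if __name__ == "__main__":
    rng = random.Random(1)
    train, test = make_data(600, rng), make_data(300, rng)
    w = distributed_markov_svm(train)
    print("test accuracy:", accuracy(w, test))
```

Each local model here is trained on only `n_chain` selected points rather than on its whole block, which is one way the smaller training cost described in the abstract could arise. The synthetic data is for illustration only and says nothing about the reported results on public data sets.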
Keywords/Search Tags:Ensemble learning, Distributed learning, Support vector machine, Markov sampling, Generalization performance