Enabling Support Vector Machines To Work For Big Data

Posted on: 2020-12-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Song
GTID: 2428330578480102
Subject: Engineering
Abstract/Summary:
With the rapid development of the Internet, artificial intelligence and data mining have become hotspots of both research and commercial activity. Among the many data mining algorithms, the Support Vector Machine (SVM) has been applied in a wide range of fields as a machine learning algorithm with a complete theoretical foundation and a low tendency to over-fit. As the study of SVMs has deepened, it has become clear that many aspects of the method still need to be optimized and improved. This thesis studies two shortcomings of the SVM. The first is the problem of optimizing SVM parameters. The second is that training takes too long on datasets with a large number of samples; to address this second shortcoming, we build on Cascade SVM. The main contributions of this thesis are as follows:

(1) From the perspective of SVM parameter optimization, and starting from SVM parameter optimization with the standard genetic algorithm (GA-SVM), SVM parameter optimization based on a multi-population genetic algorithm (MPGA-SVM) is proposed. MPGA-SVM goes beyond the evolutionary framework of the simple genetic algorithm: it introduces multiple populations that search simultaneously, balancing the algorithm's global and local search abilities. Experiments show that MPGA-SVM achieves better convergence and classification accuracy than GA-SVM.

(2) This thesis also proposes a Powell genetic algorithm (Powell-GA-SVM) that combines a mathematical optimization algorithm with the genetic algorithm, so that the otherwise undirected parameter search becomes directional. Powell-GA-SVM inherits the global search strength of the genetic algorithm while gaining the strong local search ability of the Powell algorithm. Experiments show that Powell-GA-SVM also achieves better convergence and classification accuracy than GA-SVM. Its accuracy and convergence are the best among the three parameter optimization algorithms.

(3) From the perspective of CascadeSVM, we study the time an SVM needs to train on data with a large number of samples. To address the problem that the support vectors of CascadeSVM contribute little to the later levels, an algorithm combined with ensemble learning (VoteCascadeSVM) is proposed. This thesis also reproduces the improved CascadeSVM algorithm and includes it in an extensive set of comparison experiments. Compared with five benchmark algorithms, the proposed VoteCascadeSVM shows clear advantages in classification accuracy and training time, and the advantage becomes more pronounced as the dataset grows.
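The multi-population idea in item (1) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the operators, so the per-population crossover and mutation rates, the ring migration scheme, the search bounds, and the names `mpga`, `evolve`, and `toy_fitness` are all assumptions made for illustration. In particular, `toy_fitness` is a stand-in for the cross-validated accuracy of an SVM trained with a given pair (log10 C, log10 gamma).

```python
import math
import random

# Assumed search ranges for log10(C) and log10(gamma); not taken from the thesis.
BOUNDS = [(-3.0, 3.0), (-5.0, 1.0)]


def toy_fitness(log_c, log_gamma):
    """Hypothetical stand-in for cross-validated SVM accuracy; peaks at (1, -2)."""
    return math.exp(-((log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2))


def random_individual(rng):
    return [rng.uniform(lo, hi) for lo, hi in BOUNDS]


def evolve(population, fitness, rng, crossover_rate, mutation_rate):
    """One generation: elitism, tournament selection, blend crossover, Gaussian mutation."""
    best = max(population, key=lambda ind: fitness(*ind))
    next_gen = [best[:]]  # elitism: the best individual survives unchanged
    while len(next_gen) < len(population):
        p1 = max(rng.sample(population, 2), key=lambda ind: fitness(*ind))
        p2 = max(rng.sample(population, 2), key=lambda ind: fitness(*ind))
        child = p1[:]
        if rng.random() < crossover_rate:
            w = rng.random()
            child = [w * a + (1 - w) * b for a, b in zip(p1, p2)]
        for i, (lo, hi) in enumerate(BOUNDS):
            if rng.random() < mutation_rate:
                # Mutate, then clamp back into the search range.
                child[i] = min(hi, max(lo, child[i] + rng.gauss(0.0, 0.3)))
        next_gen.append(child)
    return next_gen


def mpga(fitness, n_pops=4, pop_size=12, generations=40, seed=0):
    """Evolve several populations side by side with periodic migration."""
    rng = random.Random(seed)
    # Assumption: each population gets its own rates, so some populations
    # explore more (high mutation) while others exploit more (low mutation).
    rates = [(rng.uniform(0.6, 0.9), rng.uniform(0.05, 0.3)) for _ in range(n_pops)]
    pops = [[random_individual(rng) for _ in range(pop_size)] for _ in range(n_pops)]
    for _ in range(generations):
        pops = [evolve(p, fitness, rng, cr, mr) for p, (cr, mr) in zip(pops, rates)]
        # Ring migration: the best of population k-1 replaces the worst of population k.
        bests = [max(p, key=lambda ind: fitness(*ind)) for p in pops]
        for k, pop in enumerate(pops):
            worst = min(range(len(pop)), key=lambda i: fitness(*pop[i]))
            pop[worst] = bests[k - 1][:]
    return max((ind for p in pops for ind in p), key=lambda ind: fitness(*ind))


if __name__ == "__main__":
    log_c, log_gamma = mpga(toy_fitness)
    print(f"best log10(C)={log_c:.3f}, log10(gamma)={log_gamma:.3f}")
```

To use the sketch on a real problem, `toy_fitness` would be replaced by a function that trains an SVM with the decoded parameters and returns its validation accuracy; the rest of the loop does not depend on how the fitness is computed.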
Keywords/Search Tags: support vector machine, parameter optimization, genetic algorithm, multi-population genetic algorithm, Powell, CascadeSVM