A Linear Separable Support Vector Machine For Large Samples

Posted on:2019-12-11

Degree:Master

Type:Thesis

Country:China

Candidate:Y Qiao

Full Text:PDF

GTID:2417330566976962

Subject:Statistics

Abstract/Summary:

With the explosion of industry data, the concept of big data has gained considerable prominence. Because big data sets are very large and their features are complex and diverse, traditional support vector machine (SVM) classification algorithms are no longer practical in big data environments. Research on SVM classification under big data has therefore attracted close attention from many fields. To apply the SVM to rapid classification of massive sample data, it is necessary to filter a set of potential support vectors out of the large sample data set and use it as the SVM training set, which improves learning efficiency. With a large sample size, the cost of training an SVM increases dramatically and consumes a great deal of training time, which makes the SVM difficult to adopt for learning from massive sample data.

The separating hyperplane of a support vector machine is determined by the support vectors alone; the other training sample points have no effect on it. This thesis therefore reduces large-scale data to small-scale data, learns support vectors on the small-scale data, and iterates to obtain the final support vectors. First, it proposes a grouping algorithm for the linearly separable SVM. The algorithm randomly divides the large sample into several groups of small-sample training data sets. Training on one small-sample set yields potential support vectors, which are added to the next group for training, and so on. The support vectors obtained from training on the last group are taken as the support vectors of the large sample data set. Second, it proposes a misclassification sample preselection algorithm, which is based on the decisive role that the support vectors play in determining the separating hyperplane. From the large training set, the algorithm removes the sample points that lie far from the separating hyperplane, extracts the remaining suspect samples, and trains the SVM on these suspect samples only. In this way the SVM still uses the useful information in all the samples while saving training time and greatly improving training efficiency.

The experimental results show that the support vectors obtained by the two proposed algorithms are exactly the same as those obtained by convex quadratic programming, and that both algorithms reduce the learning difficulty and running time of the SVM, giving it real-time performance and high efficiency.
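The chaining step of the grouping algorithm can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names `fit_support_1d` and `grouped_support_vectors`, the `n_groups` and `seed` parameters, and the synthetic data are all assumptions made for illustration. To stay dependency-free, the solver is a toy stand-in that is exact only for linearly separable one-dimensional data, where the support vectors are simply the two points nearest the class boundary. Any exact hard-margin solver could be passed in its place.

```python
import random


def fit_support_1d(samples):
    """Toy stand-in for an exact hard-margin SVM solver (assumption).

    samples: list of (x, y) pairs with y in {-1, +1}, where all negatives
    lie to the left of all positives. Returns the largest negative and the
    smallest positive point, which are the support vectors in 1-D.
    A class that is absent from `samples` contributes nothing.
    """
    neg = [x for x, y in samples if y == -1]
    pos = [x for x, y in samples if y == +1]
    support = []
    if neg:
        support.append((max(neg), -1))
    if pos:
        support.append((min(pos), +1))
    return support


def grouped_support_vectors(samples, n_groups, fit_support, seed=0):
    """Randomly split `samples` into groups and chain the training.

    The potential support vectors found on one group are added to the
    next group before it is trained; the support vectors from the last
    group are returned.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    # Round-robin slicing gives n_groups groups of nearly equal size.
    groups = [shuffled[i::n_groups] for i in range(n_groups)]

    carried = []  # potential support vectors passed along the chain
    for group in groups:
        carried = fit_support(carried + group)
    return carried


# Synthetic, linearly separable 1-D data (illustrative only).
data_rng = random.Random(1)
data = ([(data_rng.uniform(-10.0, -1.0), -1) for _ in range(500)]
        + [(data_rng.uniform(1.0, 10.0), +1) for _ in range(500)])

full_support = fit_support_1d(data)
grouped_support = grouped_support_vectors(data, n_groups=10,
                                          fit_support=fit_support_1d)
print(sorted(grouped_support) == sorted(full_support))
```

In this one-dimensional toy case the chained result coincides with the result of training on all samples at once, because the extreme point of each class is always carried forward. The sketch does not show that a single pass is sufficient in higher dimensions.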

Keywords/Search Tags:

Support vector machine, separation hyperplane, large sample, grouping algorithm, misclassification sample preselection algorithm

Related items

1	A Class Of Online Algorithms For Support Vector Machine And Their Application
2	Research On The Improvement Of Sample Weighted Method In AdaBoost Algorithm
3	Inverse Distance Weighted Support Vector Machine On High-Dimension Low-Sample Size Data And Class-Imbalance Data
4	Mobile Based Student Council Voting System Case Of Federal Technical Institute(FTI)
5	A Research On Learning Process Evaluation Based On Support Vector Machine
6	Statistical Inference And Algorithm Design Of Mixture Parameter Model With Change Point
7	Evaluation Of Teaching Quality Based On Multi-Classification Algorithm Of Support Vector Machine
8	Research On Support Vector Machine And Decision Tree Algorithm For Imbalanced Datasets
9	Human-machine Chess Intelligent Control System Based On Plane Constraints
10	One-sample And Two-sample Testing Methods For High-Dimensional Covariance Matrices