
Research On The Generalization Performance Of Online SVM Classification Algorithm Based On Markov Sampling

Posted on: 2019-03-13
Degree: Master
Type: Thesis
Country: China
Candidate: Q Chen
GTID: 2428330545957139
Subject: Systems analysis and integration
Abstract/Summary:
In machine learning, the online support vector machine (SVM) classification algorithm is one of the most widely used methods for large-scale data classification. The loss function in the traditional online SVM algorithm depends on only one sample and ignores the information in historical input samples. This motivated SVM classification algorithms based on a pairwise loss function, that is, a loss function that takes a pair of (two) input samples at the same time. Because such a loss can retain and exploit the information in historical input samples, efficient online learning with pairwise loss functions is a crucial component in building large-scale learning systems that maximize the area under the receiver operating characteristic (ROC) curve, and support vector machines have become key components of large-scale classification learning systems.

So far, however, almost all known work on the generalization ability of online SVM classification algorithms assumes that the data (or samples) are independent and identically distributed (i.i.d.). This assumption is too strong both in theory and in practice. In addition, random independent sampling retains a large amount of noisy data, so the extracted training samples have a low value density. By contrast, extracting fewer but more representative samples can greatly improve the learning efficiency and performance of the learning system.

To explore the effect of non-i.i.d. data on the learning performance of the online SVM classification algorithm, this thesis weakens the i.i.d. assumption to uniformly ergodic Markov chains and introduces a new Markov sampling method to study the generalization performance of the algorithm. Two algorithms are proposed: an online SVM classification algorithm based on Markov sampling, and an online SVM classification algorithm with a pairwise loss function based on Markov sampling. Both are then evaluated in numerical experiments. The experimental results show that the two Markov sampling algorithms not only generalize better than their random (independent) sampling counterparts but also give more stable predictions, and that this advantage becomes increasingly apparent as the number of training samples grows.
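As an illustration of what a pairwise loss looks like in an online setting, the following is a minimal sketch and not the thesis's algorithm. It assumes a linear model, labels in {-1, +1}, the pairwise hinge loss max(0, 1 - ((y - y')/2) * w·(x - x')), a regularized online gradient step with step size eta/sqrt(t), and a reservoir buffer of past samples. The names `pairwise_hinge`, `online_pairwise_svm`, `eta`, `lam`, and `buffer_size` are hypothetical. The input stream is taken as given, so how samples are drawn (independently or from a Markov chain) is outside the sketch.

```python
import random


def pairwise_hinge(w, x, y, x2, y2):
    """Pairwise hinge loss for labels in {-1, +1}; zero when the labels agree."""
    if y == y2:
        return 0.0
    score_gap = sum(wi * (a - b) for wi, a, b in zip(w, x, x2))
    return max(0.0, 1.0 - 0.5 * (y - y2) * score_gap)


def online_pairwise_svm(stream, dim, eta=0.1, lam=0.01, buffer_size=50, seed=0):
    """One pass of regularized online gradient descent on the pairwise hinge loss.

    Each new sample is paired with buffered past samples of the opposite label
    (an assumed design; the buffer is maintained by reservoir sampling).
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    buffer = []
    for t, (x, y) in enumerate(stream, start=1):
        step = eta / t ** 0.5
        grad = [lam * wi for wi in w]  # gradient of (lam / 2) * ||w||^2
        pairs = [(xb, yb) for xb, yb in buffer if yb != y]
        for xb, yb in pairs:
            if pairwise_hinge(w, x, y, xb, yb) > 0.0:
                s = 0.5 * (y - yb)
                for i in range(dim):
                    # subgradient of the active hinge term, averaged over pairs
                    grad[i] -= s * (x[i] - xb[i]) / len(pairs)
        w = [wi - step * gi for wi, gi in zip(w, grad)]
        # Reservoir sampling keeps a uniform sample of the history.
        if len(buffer) < buffer_size:
            buffer.append((x, y))
        else:
            j = rng.randrange(t)
            if j < buffer_size:
                buffer[j] = (x, y)
    return w
```

Because the loss only compares samples with different labels, driving it down pushes positive examples to score above negative ones, which is the link to maximizing the area under the ROC curve mentioned in the abstract.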
Keywords/Search Tags:online support vector machine, Markov sampling, pairwise loss functions, uniformly ergodic Markov chain