Research On Margin Distribution Based Boosting Algorithms

Posted on: 2013-03-09 | Degree: Master | Type: Thesis
Country: China | Candidate: G X Guo | Full Text: PDF
GTID: 2248330362970891 | Subject: Computer application technology
Abstract/Summary:
Boosting is a powerful meta-technique that learns an ensemble of weak models with the promise of improving classification accuracy. AdaBoost is widely regarded as the most successful Boosting algorithm. Many empirical observations show that AdaBoost appears to be immune to overfitting on many datasets: the generalization error keeps decreasing as the number of iterations increases. Much work has been done to explain this phenomenon, of which the most influential is Schapire's margin theory.

Schapire's margin theory provides a theoretical explanation for the success of boosting-type methods and indicates that a good margin distribution over the training samples is essential for generalization. However, what makes a margin distribution "good" is left vague. Consequently, many recently developed algorithms each try to produce a margin distribution that is good in their own sense in order to improve generalization. In this thesis, our research focuses on margin distribution based algorithms, and our contributions are summarized as follows:

1. We first review some typical boosting-type algorithms, such as AdaBoost, L2Boost, LPBoost, AdaBoost-CG and MDBoost, and take a close look at the relationship between their parameters and the margin distribution obtained, and at how the margin distribution affects generalization performance.
2. We propose an alternative boosting algorithm termed Margin Distribution Controlled Boosting (MCBoost), which directly controls the margin distribution by introducing and optimizing a key adjustable margin parameter. MCBoost's optimization adopts the column generation technique to ensure fast convergence and a small number of weak classifiers in the final MCBooster.
3. We modify the classical SVM algorithm by introducing the idea of margin control, and improve the sparsity of SVM. The modified algorithm is easier to apply to multi-class classification, and its complexity is the same as that of solving binary SVM problems.
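To make the notion of a margin distribution concrete, the following is a minimal sketch that is not taken from the thesis. It trains plain AdaBoost (not MCBoost) with one-dimensional decision stumps and computes the normalized margin y * sum_t(alpha_t * h_t(x)) / sum_t(alpha_t) of every training sample. The function names, the stump search, and the three-point toy dataset are all assumptions made for illustration.

```python
import math


def stump_predict(threshold, polarity, x):
    """Decision stump on a 1-D input: `polarity` if x > threshold, else -polarity."""
    return polarity if x > threshold else -polarity


def train_adaboost(xs, ys, rounds):
    """Plain AdaBoost with an exhaustive search over 1-D stumps (illustrative only)."""
    n = len(xs)
    weights = [1.0 / n] * n
    points = sorted(set(xs))
    # Candidate thresholds: one below all points, then midpoints between neighbours.
    thresholds = [points[0] - 1.0] + [(a + b) / 2.0 for a, b in zip(points, points[1:])]
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        best = None
        for t in thresholds:
            for p in (1, -1):
                # Weighted training error of this stump.
                err = sum(w for w, x, y in zip(weights, xs, ys)
                          if stump_predict(t, p, x) != y)
                if best is None or err < best[0]:
                    best = (err, t, p)
        err, t, p = best
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, t, p))
        # Re-weight: misclassified samples gain weight, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(t, p, x))
                   for w, x, y in zip(weights, xs, ys)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def normalized_margins(ensemble, xs, ys):
    """Normalized margin of each sample; values lie in [-1, 1]."""
    norm = sum(alpha for alpha, _, _ in ensemble)
    return [y * sum(alpha * stump_predict(t, p, x) for alpha, t, p in ensemble) / norm
            for x, y in zip(xs, ys)]


# Hypothetical toy data: no single stump separates it, but three stumps do.
toy_xs = [1.0, 2.0, 3.0]
toy_ys = [1, -1, 1]
toy_ensemble = train_adaboost(toy_xs, toy_ys, rounds=3)
print(sorted(normalized_margins(toy_ensemble, toy_xs, toy_ys)))
```

The sorted list printed at the end is the empirical margin distribution of the training set. A sample is classified correctly exactly when its margin is positive, and margins closer to 1 indicate a more confident vote.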
Keywords/Search Tags:Boosting, Margin distribution, Margin control, Generalization, Support Vector Machine