
Research On Ensemble Learning

Posted on: 2008-04-29
Degree: Master
Type: Thesis
Country: China
Candidate: F Zhou
Full Text: PDF
GTID: 2178360242976753
Subject: Computer software and theory
Abstract/Summary:
Recently, ensemble learning has developed into a popular method in the area of machine learning. Compared with direct learning approaches, it provides a framework for combining multiple classifiers. It usually begins by decomposing the problem; by evaluating the connections among the learned classifiers, the global concept can then be recovered from the partial ones.

The most appealing features of problem decomposition fall into the following aspects. First, as the scale and number of samples grow, traditional classifiers have no corresponding principles to cope with them, so decomposing a large-scale problem into several small ones is an appropriate solution. Second, most classifiers are designed around a single specific assumption: if the problem lies in the right hypothesis space, the performance will naturally be excellent, but in most cases such a strong assumption is unreasonable, and a more proper way is to view the problem from different perspectives. Third, noise is unavoidable in real environments; without distinguishing it, we would obtain an over-fitted model, so we need a way to identify the noisy parts and discard them in the decision procedure.

Ensemble learning has been successfully applied to multi-class problems. Samples are grouped according to the class boundaries, and the resulting binary subproblems are combined in a one-vs-one or one-vs-all manner, with the final labels decided by a voting procedure. In particular, under an assumed probabilistic model, the relationships among the classifiers can be measured precisely (e.g., by the KL divergence). Unfortunately, a probability output is not guaranteed for most classifiers; in order to preserve the core algorithms as much as possible, a sigmoid curve is fitted to approximate one instead.

In past work, M³ (the min-max modular network) has proved to be an excellent system for large-scale and imbalanced cases. Unlike ensemble frameworks designed only for multi-class problems, it continues to split a pairwise problem that is still hard, and the solution of the original problem is restored by a pair of principles called Min-Max.

We conclude that a reasonable cut along some prior-known boundaries tends to yield better performance in later classification; however, the theoretical explanation is still far from well understood. In this thesis, we apply statistical knowledge to give a theoretical description. The sample set is generated by some underlying probability distribution; even if the whole distribution is hard to learn, its parts are easier to model. According to the well-known Bayes decision theory, we can obtain the optimal prediction. Our new formula shows that the Min-Max principles are exactly the equivalent combination rules for classifiers that report 0-1 outputs. Moreover, the combination of the subsets of samples can be extended to a much wider scope, for which a fast algorithm is then proposed. The experimental results show that both the time and space complexity can be reduced to a linear level.
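As a concrete illustration of the one-vs-one decomposition and voting described above, the following minimal Python sketch trains one binary module per class pair and decides labels by majority vote. The nearest-centroid base learner is an illustrative stand-in for readability; the thesis itself works with support vector machines. Where probability outputs are needed, a Platt-style sigmoid P(y=1|f) = 1/(1+exp(Af+B)) can be fitted to the raw scores f, in the spirit of the sigmoid approximation mentioned above.

```python
# Minimal one-vs-one ensemble with majority voting (illustrative sketch;
# the toy nearest-centroid model stands in for the SVMs used in the thesis).
import numpy as np
from itertools import combinations

class NearestCentroid:
    """Toy binary base learner with the usual fit/predict interface."""
    def fit(self, X, y):
        self.c0 = X[y == 0].mean(axis=0)
        self.c1 = X[y == 1].mean(axis=0)
        return self

    def predict(self, X):
        d0 = np.linalg.norm(X - self.c0, axis=1)
        d1 = np.linalg.norm(X - self.c1, axis=1)
        return (d1 < d0).astype(int)  # 1 if closer to the positive centroid

def one_vs_one_predict(X, y, X_test, base=NearestCentroid):
    """Train one module per class pair, then combine by voting."""
    classes = np.unique(y)
    votes = np.zeros((len(X_test), len(classes)), dtype=int)
    for a, b in combinations(range(len(classes)), 2):
        mask = (y == classes[a]) | (y == classes[b])
        y_bin = (y[mask] == classes[b]).astype(int)  # class b -> 1, a -> 0
        pred = base().fit(X[mask], y_bin).predict(X_test)
        votes[pred == 0, a] += 1
        votes[pred == 1, b] += 1
    return classes[votes.argmax(axis=1)]
```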
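The Min-Max recombination itself can be summarized in a few lines. In the sketch below, the positive and negative training sets are each partitioned into subsets, one module is trained per (positive subset, negative subset) pair, and the module outputs are recombined by taking the MIN across negative subsets and then the MAX across positive subsets. The random partitioning and the subset counts are illustrative assumptions only; the thesis argues that cutting along prior-known boundaries gives better results.

```python
# Sketch of the M^3 (min-max modular) decomposition and recombination:
# y(x) = max_i min_j M_ij(x), where M_ij is trained on the i-th positive
# and j-th negative subset. Any binary learner with fit/predict works
# (e.g. the NearestCentroid above); the partitions here are random for brevity.
import numpy as np

def train_modules(X_pos, X_neg, n_pos, n_neg, base):
    rng = np.random.default_rng(0)
    pos_parts = np.array_split(rng.permutation(X_pos), n_pos)
    neg_parts = np.array_split(rng.permutation(X_neg), n_neg)
    modules = []
    for P in pos_parts:
        row = []
        for N in neg_parts:
            X_ij = np.vstack([P, N])
            y_ij = np.r_[np.ones(len(P), int), np.zeros(len(N), int)]
            row.append(base().fit(X_ij, y_ij))
        modules.append(row)
    return modules

def min_max_predict(modules, X):
    # MIN over modules sharing a positive subset, then MAX over the subsets.
    mins = [np.min([m.predict(X) for m in row], axis=0) for row in modules]
    return np.max(mins, axis=0)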
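For reference, the Bayes decision rule underlying the theoretical analysis can be written as follows. This is a hedged reconstruction from the abstract, not the thesis's own derivation: the optimal prediction picks the class with the maximal posterior, and when the class-conditional density is learned piecewise over sample subsets, the subset models enter as mixture components with assumed subset priors π_{kj}.

```latex
% Bayes-optimal decision over classes \omega_k (standard form; the exact
% per-subset formulation in the thesis is not reproduced here).
\hat{y}(x) = \arg\max_k P(\omega_k \mid x)
           = \arg\max_k \, p(x \mid \omega_k)\, P(\omega_k),
\qquad
p(x \mid \omega_k) = \sum_j \pi_{kj}\, p(x \mid \omega_k, j)
```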
Keywords/Search Tags: min-max modular network, large-scale and imbalanced data classification, ensemble learning, Bayes decision, support vector machine