Font Size: a A A

Ensemble Strategy Based On Min-Max Rule

Posted on:2016-03-11Degree:MasterType:Thesis
Country:ChinaCandidate:G J ZhouFull Text:PDF
GTID:2308330473965467Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Ensemble learning has become one of the hot spots in machine learning and data mining since 1990 s. It constructs a strong learner with better generalization by combining different weak learners. It worked well in the field of both classification and feature selection. Therefore, this study is focus on some research on the ensemble learning in classification and feature selection, especially a classical ensemble strategy based on Min-Max rule..An ensemble strategy based on Min-Max rule for classification is introduced firstly. To deal with the classification problem in the field of biology and medicine, it is compared with three other well-known ensemble strategies: Bagging, AdaBoost and random subspace. The experiments on some real-world datasets, including medical image recognition, cancer diagnosis and protein localization sites, are designed later. Results show that: firstly, when compared with using a single classifier, ensemble learning can truly improve the classification accuracy. Secondly, Bagging and AdaBoost can make an improvement all the time but not observably. Thirdly, random subspace strategy works better when dealing with the problem with high dimension, while Min-Max rule based strategy with less features. Lastly, as the number of features we selected increasing in such classification problems, the classification accuracy may decrease.In order to deal with large scale problems in feature selection, ensemble feature selection using Min-Max rule is proposed there. It decomposes the original data into a group of relatively smaller balanced ones according to their labels, and then combining the different results of sub-problems by Min-Max rule. The experiments are designed to compare Min-Max rule based ensemble method with three other strategies on some real-world datasets. Results show that Min-Max rule based method is superior to other ones in most case especially when the number of selected features is small. Besides, feature redundancy analysis is integrated into the proposed Min-Max rule based ensemble model. It detects the subset of features which have a strong correlation in a set of relevant features, and then eliminates the maximum number of redundant features according to maximum spanning tree. The results on both real-world and synthetic datasets show that the classification performance can be truly improved with less redundant features.
Keywords/Search Tags:ensemble learning, Min-Max rule, feature selection, ensemble feature selection
PDF Full Text Request
Related items