Research On Ensemble Learning Integrated With Non-labeled Sample Selection

Posted on: 2016-06-05; Degree: Master; Type: Thesis
Country: China; Candidate: H Qin; Full Text: PDF
GTID: 2308330464471555; Subject: Control Science and Engineering
Abstract/Summary:
With the rapid development of information technology, people's ability to obtain data has been greatly enhanced. The large volumes of data generated in every walk of life can be collected and stored by data acquisition systems and computers. In recent years especially, advances in data acquisition and storage technology have made it far easier to obtain non-labeled sample data than labeled sample data, while labeling the non-labeled samples by hand is often impractical and time-consuming. Under these circumstances, the traditional supervised learning model is no longer practical, and how to learn from a small number of labeled samples together with a large number of non-labeled samples has attracted growing research interest. The key to exploiting the massive non-labeled samples found in the real world is a learning model that can make full use of these non-labeled samples and a few hand-labeled samples. However, the models provided by the current mainstream semi-supervised learning and active learning strategies still have problems, such as low classification accuracy, heavy computation, and long training time. Therefore, improving the efficiency of learning with non-labeled samples remains a research hotspot and a difficult problem in this field.
To this end, this thesis studies improvements to active learning and semi-supervised learning models by combining them with ensemble learning. The main work is as follows:
(1) Recent international research on non-labeled sample selection methods is summarized, and the merits and demerits of active learning and semi-supervised learning strategies are analyzed.
(2) A highly efficient learning method that integrates active learning and ensemble learning based on divergence evaluation is proposed, in which training is divided into two phases: pre-training and post-training. Depending on sample divergence and the training stage, the method applies different non-labeled sample selection strategies to reduce the impact of early-stage misjudgments on learning accuracy. To evaluate its performance, the method was tested on an artificial data stream and on HEp-2 cell image data. The experimental results show that it needs fewer training samples and achieves higher classification accuracy than the current Qboost method.
(3) An extreme ensemble learning method based on a semi-supervised learning strategy is proposed. By combining the small labeled-sample requirement of semi-supervised learning with the accuracy and robustness of ensemble learning, and by training its classifiers with extreme learning machines, the method improves classification accuracy and significantly reduces training time. To assess its validity, the method was tested on the same data. The experimental results show that it achieves accuracy comparable to the method proposed in (2) and needs less training time than the current mainstream methods, including the method proposed in (2).
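The abstract does not say how sample divergence is measured. As a rough illustration only, the sketch below uses vote entropy, a common committee-disagreement measure in query-by-committee active learning, as a stand-in. The function names, the example votes, and the choice of vote entropy are all assumptions for illustration and are not taken from the thesis.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of one sample's committee votes (0 means full agreement).

    Assumption: vote entropy stands in for the thesis's divergence measure.
    """
    total = len(votes)
    counts = Counter(votes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def select_most_divergent(committee_votes, k):
    """Return the indices of the k samples the committee disagrees on most."""
    scores = [vote_entropy(v) for v in committee_votes]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical votes: each row holds the labels that 4 ensemble members
# predict for one non-labeled sample.
committee_votes = [
    ["a", "a", "a", "a"],  # full agreement
    ["a", "b", "a", "b"],  # even split
    ["a", "a", "a", "b"],  # mild disagreement
]
selected = select_most_divergent(committee_votes, k=2)
print(selected)
```

In an active learning loop, the high-divergence samples would typically be the ones sent for manual labeling. A semi-supervised variant might instead pseudo-label the low-divergence samples on which the ensemble agrees.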
Keywords/Search Tags: data stream, active learning, semi-supervised learning, ensemble learning, extreme learning machine