
Research on Transfer Learning Algorithms Based on Ensemble Selection Methods

Posted on: 2018-11-17
Degree: Master
Type: Thesis
Country: China
Candidate: M L Li
Full Text: PDF
GTID: 2348330536487937
Subject: Computer Science and Technology
Abstract/Summary:
Traditional machine learning is based on statistical learning, and one of its major assumptions is that the training and testing data lie in the same feature space and follow the same distribution. In many real-world applications this assumption does not hold, so traditional machine learning techniques are ineffective on these problems. In recent years, transfer learning has emerged as a new learning paradigm to cope with this challenge. Its most notable characteristic is that it exploits previously learnt knowledge, leveraging information from an old source domain to help learning in a new target domain, so that learning no longer has to start from scratch but can build on what is already known. Researchers have proposed many methods for transfer learning, including ones based on Support Vector Machines and Artificial Neural Networks. Although studies show that these methods improve performance, relying on a single model to solve a transfer learning problem has limitations, so researchers have proposed solving the problem with ensemble learning methods. However, ensemble learning requires many classifiers, which increases time and space complexity, and classifiers with low generalization performance degrade the final classification performance. To deal with this, a subset of the original ensemble can be selected to construct a new ensemble system. This approach is called ensemble pruning, or selective ensemble, and it is a popular way to overcome the high computational cost of ensemble learning methods.

In this thesis, a novel knowledge-leverage-based algorithm, RankRE-TL, is proposed for transfer learning text categorization. The algorithm integrates a knowledge-leverage-based transfer learning mechanism with a Rank-based Reduce Error evaluation measure (RankRE) to carry out the transfer learning task. Rank-based ensemble selection is conceptually the simplest kind of ensemble selection and offers a performance advantage. The RankRE measure is adapted from Reduce Error (RE) pruning for transfer learning. The idea behind RankRE is to find the candidate classifier that is expected to improve the classification performance of the extended subensemble the most. Moreover, although the source-domain data are somewhat similar to the target-domain data, the numbers of labeled examples in the two domains are severely imbalanced. To address this, RankRE-TL uses dynamic training dataset regrouping. Specifically, a number of sub-datasets are drawn from the source examples by bootstrap sampling at different proportions. Each sub-dataset is then combined with the relatively scarce labeled target samples to obtain a reconstructed dataset. In addition, a new method of constructing the validation set is designed for RankRE-TL, which differs from the method used in conventional ensemble selection.

However, ensemble selection based on the RankRE evaluation measure is a greedy method and is therefore likely to be trapped in a local optimum. To deal with this problem and to transfer knowledge from the source domain more effectively, this thesis proposes a new transfer learning method, TrGASVM, which combines TrSVM with the GASEN technique. TrSVM first learns many source models based on dynamic training dataset reconfiguration, obtains a set of support vectors from each source model, and estimates the weight of each support vector by measuring its similarity to the target training dataset. Each support vector set is then combined with the target training dataset to obtain a new training dataset, and the transfer learning SVM ensemble system is built from these datasets. GASEN (Genetic Algorithm based Selective Ensemble) is a heuristic algorithm for combinatorial optimization that uses a genetic algorithm to select classifiers from the ensemble system. GASEN not only inherits the strengths of genetic algorithms but also avoids the local optima that affect greedy ensemble pruning methods. By combining TrSVM and GASEN for the transfer learning task, TrGASVM has the advantages of both and transfers knowledge from the source domain effectively.
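
The two building blocks named in the abstract, dynamic training dataset regrouping and greedy reduce-error ordering, can be illustrated with a minimal sketch. This is not the thesis's RankRE-TL implementation: the abstract does not specify how RankRE modifies Reduce Error pruning or how the validation set is built, so the code below shows only plain reduce-error ordering with majority voting on a given validation set. The function names `regroup_datasets` and `reduce_error_order`, the label encoding in {-1, +1}, and the treatment of tied votes as errors are all assumptions made for illustration.

```python
import numpy as np


def regroup_datasets(X_src, y_src, X_tgt, y_tgt, proportions, rng):
    """Hypothetical helper: one reconstructed dataset per source proportion.

    Each dataset is a bootstrap sample (with replacement) of the source
    examples, sized as a fraction of the source set, stacked on top of all
    of the scarce labeled target examples.
    """
    datasets = []
    n_src = len(X_src)
    for p in proportions:
        size = max(1, int(round(p * n_src)))
        idx = rng.integers(0, n_src, size=size)  # sampling with replacement
        X = np.vstack([X_src[idx], X_tgt])
        y = np.concatenate([y_src[idx], y_tgt])
        datasets.append((X, y))
    return datasets


def reduce_error_order(preds, y_val):
    """Plain greedy reduce-error ordering (not the thesis's RankRE measure).

    preds: array of shape (n_classifiers, n_val) with labels in {-1, +1}.
    y_val: array of shape (n_val,) with labels in {-1, +1}.
    At each step, the classifier whose addition gives the lowest
    majority-vote error on the validation set is appended to the ordering.
    Tied votes (sum of 0) are counted as errors, which is an assumption.
    """
    remaining = list(range(preds.shape[0]))
    votes = np.zeros(preds.shape[1])
    order, errors = [], []
    while remaining:
        best, best_err = None, None
        for k in remaining:
            err = float(np.mean(np.sign(votes + preds[k]) != y_val))
            if best_err is None or err < best_err:
                best, best_err = k, err
        order.append(best)
        errors.append(best_err)
        votes += preds[best]
        remaining.remove(best)
    return order, errors


if __name__ == "__main__":
    y_val = np.array([1, 1, -1, -1])
    preds = np.array([
        [1, 1, -1, -1],    # classifier 0: all correct
        [-1, -1, 1, 1],    # classifier 1: all wrong
        [1, -1, -1, -1],   # classifier 2: one mistake
    ])
    order, errors = reduce_error_order(preds, y_val)
    # Keep the prefix of the ordering with the lowest validation error.
    best_size = int(np.argmin(errors)) + 1
    print(order[:best_size], errors)
```

Because each step commits to the locally best classifier and never revisits earlier choices, this kind of ordering is greedy, which is the limitation the abstract gives as motivation for the genetic-algorithm-based selection in GASEN.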
Keywords/Search Tags: Transfer learning text categorization, Selective ensemble, Rank-based ensemble selection method, Dynamic training dataset regrouping, SVM, GASEN