| Transfer learning is an effective way to handle cross-domain data problems: it learns knowledge from outdated data to help with a new task. Because it does not depend on the independent and identically distributed assumption of traditional machine learning, transfer learning has been applied successfully in many areas. Ensemble learning builds a prediction model by combining multiple diverse classifiers. Because of its stability and strong generalization performance, ensemble learning has become one of the hot topics in the machine learning community. Against the background of news text classification, this paper studies transfer learning and the ensemble Bagging algorithm, and proposes an improved algorithm combining the two, which provides a suitable framework for classification when the target training set is small. The concept of ensemble learning and its development are explained first; then the concept and categories of transfer learning, as well as its applications, are introduced. After that, the preprocessing of the news text data sets is described in detail, and the parameters and feature selection algorithms are discussed and fixed, which makes the input used to build the classification model more accurate and suitable. Finally, because the target domain has too few training samples to establish a good classification model, a cross-domain classification model based on the ensemble Bagging algorithm within a transfer framework is explored. This model introduces the source data, screens them, and learns from the mixed data set, so that a classification model can be established with the ensemble Bagging algorithm; the final result is obtained by voting. Comparative simulation experiments show that the ensemble Bagging algorithm with Bayesian base classifiers achieves the best classification accuracy and generalization performance, both in transferring from the source domain and in classifying in the target domain. 
Meanwhile, this paper analyzes how the amount of noisy data in the source domain affects the classification model; the experimental results show that the transfer and ensemble model based on the Bagging algorithm can partly avoid negative transfer. In summary, the differences between feature selection algorithms in the data preprocessing stage are studied first. Because complete English preprocessing procedures are rarely found through domestic search engines, a Chinese text processing procedure is improved in this paper, and a complete, graphically illustrated English text preprocessing method is compiled. Then, transfer learning and ensemble learning are combined to discuss solutions for cross-domain data comprehensively. A classification model, an ensemble Bagging algorithm based on selective transfer, is presented, and experiments show that this model has better overall performance and a certain ability to resist negative transfer. |
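
The pipeline summarized above (screen the source data, learn from the mixed data set, build a Bagging ensemble of Bayesian classifiers, decide by voting) can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not say how source samples are screened, so the rule used here (keep a source sample only when a classifier trained on the target set agrees with its label) is an assumption. All function names, the toy data, and the choice of 11 estimators are hypothetical as well.

```python
import math
import random
from collections import Counter, defaultdict


def train_nb(samples):
    """Fit a multinomial naive Bayes model on (tokens, label) pairs."""
    class_counts = Counter(label for _, label in samples)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in samples:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab, len(samples)


def predict_nb(model, tokens):
    """Return the label with the highest log-posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab, n = model
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):  # sorted for deterministic ties
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / n)
        for tok in tokens:
            if tok in vocab:  # words unseen in training are ignored
                count = word_counts[label][tok]
                score += math.log((count + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def screen_source(source, target_train):
    """ASSUMED screening rule, not taken from the paper: keep a source
    sample only if a model trained on the target set predicts its label."""
    target_model = train_nb(target_train)
    return [(t, y) for t, y in source if predict_nb(target_model, t) == y]


def fit_bagging(samples, n_estimators=11, seed=0):
    """Train one naive Bayes model per bootstrap resample of the data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_nb(bootstrap))
    return models


def predict_bagging(models, tokens):
    """Combine the base classifiers by majority vote."""
    votes = Counter(predict_nb(m, tokens) for m in models)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): a tiny target set and a source set with one
# mislabeled, noisy sample.
target_train = [
    (["match", "goal", "team"], "sports"),
    (["stock", "market", "bank"], "finance"),
]
source = [
    (["team", "goal", "league"], "sports"),
    (["bank", "stock", "profit"], "finance"),
    (["goal", "match", "coach"], "finance"),  # noisy label
]

kept = screen_source(source, target_train)      # selective transfer
models = fit_bagging(target_train + kept)       # learn the mixed data set
print(predict_bagging(models, ["goal", "team"]))
```

On this toy data the screening step discards the mislabeled source sample before the mixed set is resampled. This is one plausible way such a model could limit negative transfer, but the screening criterion actually used in the paper may differ.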