
Research And Application In Ensemble Of Incremental Learning Algorithm Based On CART Algorithm

Posted on: 2020-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: M Wang
Full Text: PDF
GTID: 2428330623956577
Subject: Computer technology
Abstract/Summary:
With the widespread use of big data, two questions have become urgent in practical applications: how to train effectively on massive data while improving the scalability and accuracy of the prediction model, and how to overcome the "stability-plasticity dilemma" of traditional machine learning algorithms so that a predictive model can handle a continuous influx of data, learning new data continuously and quickly and extracting useful knowledge from it to guide subsequent decision-making. Incremental learning algorithms can process massive amounts of data incrementally, overcome the stability-plasticity dilemma, and learn new data efficiently, which gives them the ability to train and update models continuously.

The decision tree is one of the most widely used classification algorithms in machine learning, but it lacks incremental learning ability, so incremental decision trees have attracted wide research attention. Existing incremental learning algorithms for decision trees, such as the ID5R algorithm and its improvements, repeatedly adjust an initial decision tree and always use a single decision tree as the classifier. However, the classification capability of a single classifier is limited, and ensemble learning can significantly improve the generalization of a learning system relative to a single classifier.

This thesis therefore studies how to improve the CART decision tree with ensemble learning so that the improved algorithm has incremental learning ability, can process massive data incrementally, and can learn new data efficiently. Then, to address the problems of the ensemble incremental learning algorithm, it uses the theory of "selective integration" to construct a better subset of classifiers and further improve the model's classification performance. The main research contents are as follows:

1. To overcome the "stability-plasticity dilemma" of the CART decision tree algorithm, the CART decision tree algorithm is combined with the Learn++ incremental learning algorithm to realize I-CART, a CART decision tree incremental learning algorithm based on ensemble learning. This gives the CART decision tree incremental learning capability and improves both the efficiency of learning new data and the classification performance.
2. To make the voting weights of the base classifiers in the I-CART algorithm more objective and fair, and to prevent excessive weights on hard-to-classify samples from degrading the classification performance of the ensemble classifier, the Kappa coefficient is used as the voting weight of each base classifier. The resulting I-CART.Kappa algorithm further reduces the classification error rate of the I-CART algorithm.
3. In ensemble incremental learning algorithms, the final ensemble classifier grows too large, which leads to excessive memory consumption and a reduced classification rate. Based on the idea of "selective integration", this thesis studies the relationship between the diversity and the accuracy of the base classifiers and proposes two selective ensemble algorithms, the "vertical scribing method" and the "horizontal scribing method". By selecting base classifiers with high diversity and high accuracy, these algorithms significantly reduce the size of the ensemble classifier and improve classification performance.
4. Comparison experiments on UCI datasets confirm the effectiveness of the I-CART.Kappa algorithm and the selective ensemble algorithms.
5. On the AOTP flight information dataset, a flight delay prediction model is built with the proposed algorithms to demonstrate the efficiency and practicality of the incremental learning algorithm. A real-time flight delay prediction system is also designed and developed with Apache Kafka; it provides real-time prediction on flight information as well as real-time training and automatic updating of the flight prediction models.
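The idea in item 2, weighting each base classifier's vote by a Kappa coefficient, can be illustrated with a small sketch. The abstract does not give the exact formulation used in I-CART.Kappa, so what follows assumes the standard Cohen's kappa, computed for each base classifier on a labeled evaluation set, and a plain weighted vote. The function names `cohen_kappa` and `kappa_weighted_vote`, the clipping of non-positive kappa to zero, and the toy data are illustrative assumptions, not the thesis's implementation.

```python
from collections import Counter, defaultdict


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between predictions and labels, corrected for chance."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    n = len(y_true)
    # Observed agreement p_o: fraction of samples where prediction equals label.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement p_e, from the marginal class frequencies of both sequences.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Kappa is undefined (0/0) when both sides contain one identical class.
        # Returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return (p_o - p_e) / (1.0 - p_e)


def kappa_weighted_vote(predictions, kappas):
    """Combine one predicted label per base classifier into a single label.

    Each classifier votes with weight max(kappa, 0), so a classifier that is
    no better than chance contributes nothing (an assumption of this sketch).
    """
    scores = defaultdict(float)
    for label, kappa in zip(predictions, kappas):
        scores[label] += max(kappa, 0.0)
    return max(scores, key=scores.get)


# Toy evaluation set and the outputs of three hypothetical base classifiers.
y_eval = [0, 0, 1, 1, 0, 1]
tree_outputs = [
    [0, 0, 1, 1, 0, 1],  # agrees with every label
    [0, 1, 1, 0, 0, 1],  # two mistakes
    [1, 1, 1, 1, 1, 1],  # always predicts class 1
]
weights = [cohen_kappa(y_eval, out) for out in tree_outputs]

# For a new sample the three trees predict 0, 1 and 1 respectively.
final_label = kappa_weighted_vote([0, 1, 1], weights)
```

In this toy case an unweighted majority vote would return 1, while the kappa-weighted vote returns 0, because the only classifier that performs well above chance on the evaluation set predicts 0. A classifier that always outputs one class gets kappa 0 regardless of how often that class happens to be correct, which is the kind of chance correction that plain accuracy-based weights lack.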
Keywords/Search Tags: incremental learning, CART algorithm, ensemble learning, selective ensemble