
The Research Of Ensemble Learning Based On Particle Swarm Optimization Algorithm

Posted on: 2009-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: T Y Lu
Full Text: PDF
GTID: 2178360242480859
Subject: Computer application technology
Abstract/Summary:
This work was supported by the National Natural Science Foundation project "the research of a number of issues in Statistical Relationship Study". It covers particle swarm optimization (PSO), back-propagation (BP) neural networks, and their application to data mining, as well as ensemble learning, including the Boosting algorithm, the Bagging algorithm, selective ensemble, and heterogeneous ensemble. We also study individual learning machines and data preprocessing for data mining.

Intelligent computing comprises many problem-solving algorithms inspired by natural principles, such as artificial neural networks, genetic algorithms, simulated annealing, and swarm intelligence. In recent years, hybrid intelligence, which integrates the advantages of several intelligent algorithms for simulation, reasoning, and problem solving, has become a research hot spot; machines with this capability are called hybrid intelligent systems. Ensemble learning has likewise attracted growing attention.

Our work is as follows:

1. Research on intelligent computing. We implement the BP algorithm and improve on its weaknesses. Although the improved BP algorithm speeds up convergence, it can still produce poor results under some conditions. We therefore propose the PSO-BP algorithm, based on the idea of hybrid intelligent computing: it uses PSO to optimize BP and obtains much better results. The experiments show that the improved BP algorithm raises the convergence rate but cannot guarantee a satisfactory model every time.

2. Research on ensemble learning.

1) We improve the AdaBoost algorithm so that its ensemble learning can handle multi-class samples, and we implement the PSO-Boosting algorithm.

2) Bagging requires unstable base learners, such as the decision tree and artificial neural network algorithms.
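The PSO-BP hybrid described in point 1 above can be illustrated with a minimal sketch: a PSO phase searches the weight space of a small feed-forward network, and plain gradient descent (BP) then fine-tunes the best particle. The network size, PSO parameters, and the XOR toy data are illustrative assumptions, not the thesis's actual settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: XOR, a classic problem on which plain BP can stall.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
y = np.array([[0], [1], [1], [0]], float)

H = 4                                # hidden units (assumed)
DIM = 2 * H + H + H + 1              # W1 (2xH) + b1 (H) + W2 (Hx1) + b2 (1)

def unpack(v):
    W1 = v[:2 * H].reshape(2, H)
    b1 = v[2 * H:3 * H]
    W2 = v[3 * H:4 * H].reshape(H, 1)
    b2 = v[4 * H:]
    return W1, b1, W2, b2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -60, 60)))

def forward(v):
    W1, b1, W2, b2 = unpack(v)
    h = sigmoid(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

def mse(v):
    return float(np.mean((forward(v)[1] - y) ** 2))

# --- PSO phase: global search over the flattened network weights ---
P, ITERS = 30, 200
pos = rng.uniform(-2, 2, (P, DIM))
vel = np.zeros((P, DIM))
pbest = pos.copy()
pbest_f = np.array([mse(p) for p in pos])
gbest = pbest[pbest_f.argmin()].copy()

for _ in range(ITERS):
    r1, r2 = rng.random((P, DIM)), rng.random((P, DIM))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    f = np.array([mse(p) for p in pos])
    better = f < pbest_f
    pbest[better], pbest_f[better] = pos[better], f[better]
    gbest = pbest[pbest_f.argmin()].copy()

# --- BP phase: gradient descent starting from the PSO solution ---
v = gbest.copy()
for _ in range(2000):
    W1, b1, W2, b2 = unpack(v)
    h, out = forward(v)
    d_out = (out - y) * out * (1 - out)      # dMSE/dz2 (constant folded into lr)
    d_h = (d_out @ W2.T) * h * (1 - h)
    grad = np.concatenate([
        (X.T @ d_h).ravel(), d_h.sum(0),
        (h.T @ d_out).ravel(), d_out.sum(0),
    ])
    v -= 0.5 * grad

print("final MSE:", mse(v))
```

The division of labor is the point of the hybrid: PSO's global search reduces BP's sensitivity to the initial weights, and BP then refines the solution locally.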
"Unstable" means that the little change on the data sample can make the sorting result change a lot. We realize the improved Bagging arithmetic based on PSO-BP, also BPO-Bagging is realized.3) We implement the Selective Ensemble Algorithm. In Ensemble Learning, we can choose some of classifiers. It can get better results than combining all the classifiers sometimes. You may ask that if we can get better results when classifiers are more. The answer is no, because it will need more memory space. When there are many classifiers that can be used, it is better to choose some to ensemble than use them all. Because BP arithmetic is sensive on the initialization parameters, We use PSO-BP to train the data sample by changing the initialization parameters, and get the models list. We can realize the wide range selective ensemble.4) We propose a Ensemble Algorithm which contains different types of classifiers. It ensembles many ANN classifiers and a Bayes classifier, and use PSO to optimaze weight.5) We propose the idea that we can use PSO to optimize the weight of ensemble learning. As a method to deal with the sample set which have More than two types, we can make the weight of all the classifier same, or we can set the weight of the sorting machine according to the shooting rate. But the methods above is not supported by theory. In this paper, we use PSO to optimize the weight of the classifier of ensemble learning, and improved the genaralizing ability of the learning machine.The results of the experiment show that: the PSO-BP algorithm is better than the traditional BP algorithm in the accuracy of the training and testing sets; if we ensure the accuracy of the prediction, the convergence rate of training the data set which has not been pretreated is much lower than the one of training the data set which has been pretreated already, pretreated data is much more higher than the one which has not been pretreated in whether accurate or train the accuracy of the set. 
Because of data format issues, noise, and other problems, training sometimes fails to converge on data sets that have not been preprocessed.

3. Research on the individual classifier.

1) We design and implement a Bayesian classifier. Because the attributes of the ANN training samples must be numerical, they have to be discretized for the Bayes classifier; the discretization uses an entropy-based method.

2) We design and implement a BP neural network classifier. In ensemble learning research, the design of this classifier is important for making the ensemble convenient and effective: instead of outputting a single class label, the BP classifier outputs the degree of similarity to each class.

3) Data preprocessing includes data cleaning, data integration, and data reduction. Focusing on the ability to identify an attribute's type from its values, this paper presents a dimensionality-reduction method and a simple numerical-reduction method for preprocessing the relevant data.

The experiments show that, on the vast majority of data sets, the training and prediction results of Bagging or Boosting are better than those of a single artificial neural network, and Boosting usually beats Bagging in prediction accuracy on both the training and test sets. The prediction accuracy of the Bagging, Boosting, selective ensemble, and heterogeneous ensemble algorithms improves further when the weights are optimized by PSO.

4. This paper studies the individual learning machine and data preprocessing in depth, implements the main ideas of the individual classifier and of data preprocessing from the perspective of ensemble learning, studies intelligent computing in depth, and improves the capability of the individual learning machine based on the idea of hybrid intelligent computing.
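The entropy-based discretization mentioned in section 3, point 1 can be sketched as choosing the single cut point on a numeric attribute that minimizes the weighted class entropy of the two resulting intervals (the multi-interval version applies this recursively). The toy attribute values below are illustrative assumptions.

```python
import numpy as np

def entropy(labels):
    # Shannon entropy of the class distribution, in bits.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def best_cut(values, labels):
    order = np.argsort(values)
    v, y = np.asarray(values)[order], np.asarray(labels)[order]
    n = len(v)
    best = (None, entropy(y))                # (cut point, weighted entropy)
    for i in range(1, n):
        if v[i] == v[i - 1]:
            continue                         # no boundary between equal values
        e = (i * entropy(y[:i]) + (n - i) * entropy(y[i:])) / n
        if e < best[1]:
            best = ((v[i - 1] + v[i]) / 2, e)
    return best

# Attribute where low values are mostly class 0 and high values class 1.
vals = [1.0, 1.2, 1.5, 2.0, 5.0, 5.5, 6.0, 6.2]
labs = [0, 0, 0, 0, 1, 1, 1, 1]
cut, e = best_cut(vals, labs)
print(f"cut at {cut} with weighted entropy {e:.3f}")
```

On this toy attribute the chosen cut falls between 2.0 and 5.0, where it separates the classes perfectly and drives the weighted entropy to zero.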
Finally, building on the above studies, we research ensemble learning in depth: we implement the Boosting, Bagging, selective ensemble, and heterogeneous ensemble algorithms, optimize the weights of their individual learning machines with PSO following the idea of hybrid intelligent learning, and realize the corresponding algorithms, so that all of our work is integrated into a whole.
Keywords/Search Tags: Optimization