Research On Particle Swarm Optimization Weighted Random Forest Algorithm

Posted on: 2018-05-07
Degree: Master
Type: Thesis
Country: China
Candidate: X X Cheng
Full Text: PDF
GTID: 2348330515973236
Subject: Control theory and control engineering
Abstract/Summary:
The random forest (RF) algorithm is a classification model proposed by Breiman in 2001. In essence, it combines Breiman's Bootstrap Aggregating (Bagging) algorithm with Ho's Random Subspace method, and it determines the final classification by a vote over the results of its decision trees. Since it was proposed, the random forest algorithm has been widely used in data mining and classification problems, and many scholars have since improved the model. Random forest is an efficient classification algorithm: it needs no prior knowledge of the samples, requires no feature selection, and tolerates noise well, so complicated data preprocessing can be omitted.

The model also has shortcomings. This thesis analyzes the traditional random forest algorithm experimentally and identifies where its performance falls short. First, under the standard voting mechanism a decision tree with low training accuracy has the same voting power as an accurate one, which lowers the accuracy of the vote and therefore of the final classification. Second, during classification two or more categories may receive the same highest number of votes, so that the sample is difficult to classify; this thesis calls this the "dead phenomenon". Third, the number of decision trees and the other parameters of the model usually have a large influence on the final classification result, yet they are difficult to select.

To address the low-accuracy votes and the tied votes, this thesis proposes an accuracy-weighted random forest model in which, during voting, the vote of each decision tree is multiplied by a weight proportional to that tree's training accuracy. To address parameter selection, the parameters of the model are chosen by iterative optimization with the particle swarm optimization algorithm. Simulation experiments are designed and carried out in Matlab on six standard datasets from the UCI database. Finally, the advantages and disadvantages of the new model are compared against conventional and other related algorithms, and the conclusion is that the new model has an advantage in classifying such data.
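The accuracy-weighted vote described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's Matlab implementation: the function names `tally_votes` and `winners`, the tolerance `tol`, and the four-tree example values are all hypothetical, and the weights are simply set equal to the training accuracies (one choice of "proportional to"). The example shows a case where an unweighted vote ties (the "dead phenomenon") while the weighted vote produces a single winner.

```python
from collections import defaultdict


def tally_votes(predictions, weights=None):
    """Sum the votes per class label; with no weights, every tree counts as 1."""
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per tree prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return dict(totals)


def winners(totals, tol=1e-12):
    """Return all labels whose total is within tol of the maximum.

    More than one label in the result means the vote is tied.
    """
    best = max(totals.values())
    return sorted(label for label, score in totals.items() if best - score <= tol)


# Hypothetical example: four trees, the label each one predicts for a single
# sample, and each tree's training accuracy (used directly as its weight).
tree_predictions = ["A", "A", "B", "B"]
tree_accuracies = [0.60, 0.65, 0.90, 0.85]

plain = winners(tally_votes(tree_predictions))
weighted = winners(tally_votes(tree_predictions, tree_accuracies))

print("unweighted winners:", plain)
print("weighted winners:", weighted)
```

In this made-up case the unweighted tally is two votes each for A and B, whereas the weighted totals are 1.25 for A and 1.75 for B, so the more accurate trees decide the outcome. Weighting reduces ties but does not rule them out, which is why `winners` still returns a list.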
Keywords/Search Tags:Random forest, particle swarm optimization, C4.5 algorithm, decision tree