
Research On Decision Forest Algorithm Based On Attribute Reduction

Posted on: 2020-10-07
Degree: Master
Type: Thesis
Country: China
Candidate: C Sun
Full Text: PDF
GTID: 2428330578968582
Subject: Applied Statistics

Abstract/Summary:
Decision forest is an ensemble learning method based on decision trees. By combining the advantages of ensemble learning with those of decision trees, a decision forest can effectively avoid overfitting and improve classification accuracy. However, the traditional way of constructing decision trees as base learners, which relies on classical algorithms such as ID3 and C4.5, has some limitations. For example, it cannot avoid problems such as the duplication of subtrees and the repeated selection of certain attributes, which directly increase model complexity and make the extracted rules harder to interpret. To overcome these limitations, we first develop a decision forest algorithm based on attribute reduction, and then propose an improved algorithm that incorporates new attributes incrementally.

By combining attribute reduction, a data processing technique from rough set theory, with decision trees, we propose a new decision tree algorithm, on the basis of which we develop the decision forest algorithm using the idea of ensemble learning. For classification tasks, the final prediction is obtained by voting. Specifically, attribute reduction follows the criterion of keeping only the condition attributes that can differentiate samples with different decision attribute values; in other words, only the truly effective attributes are retained when constructing a decision tree. To address the problems of the classical tree-building methods, namely the duplication of subtrees and the repeated selection of certain attributes, the splitting attribute at each node is chosen, after reduction, in descending order of importance to the classification task. When building the decision forest, plurality voting is used to integrate the outputs of all decision trees and produce the final predictions, as illustrated in the sketch below.

We further propose a decision forest algorithm that adds new attributes incrementally. First, an initial decision tree is trained on a given attribute subset. Then, provided the tree's predictive ability does not decline, the remaining attributes are evaluated one by one to determine whether replacing existing attributes of the tree with them can simplify its structure. Finally, plurality voting is again used to obtain the output of the decision forest.

Experiments show that the proposed decision forest method based on attribute reduction is correct and effective on several data sets. Moreover, when the ensemble learning method is introduced into the tree-building process, the algorithm incurs less time cost while classification accuracy remains almost unchanged.
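The following Python sketch illustrates the general pipeline described above: a rough-set-style attribute reduction that keeps only condition attributes needed to separate samples with different decision values, followed by a forest of trees combined by plurality voting. It is a minimal illustration, not the thesis's actual algorithm; all names (reduce_attributes, DecisionForest, the use of scikit-learn trees, the random attribute sub-sampling per tree) are assumptions introduced here for clarity.

    # Hypothetical sketch of the described approach; not the author's implementation.
    from collections import Counter
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def _discerns_all(X_sub, y):
        """True if samples with different labels never share identical attribute rows."""
        seen = {}
        for row, label in zip(map(tuple, X_sub), y):
            if seen.setdefault(row, label) != label:
                return False
        return True

    def reduce_attributes(X, y):
        """Drop a condition attribute if the remaining ones still distinguish
        every pair of samples with different decision (class) values."""
        keep = list(range(X.shape[1]))
        for j in range(X.shape[1]):
            trial = [k for k in keep if k != j]
            if trial and _discerns_all(X[:, trial], y):
                keep = trial
        return keep

    class DecisionForest:
        """Trees grown on random subsets of the reduced attributes;
        predictions combined by plurality voting."""
        def __init__(self, n_trees=10, random_state=0):
            self.n_trees = n_trees
            self.rng = np.random.default_rng(random_state)
            self.trees, self.subsets = [], []

        def fit(self, X, y):
            reduct = reduce_attributes(X, y)
            for _ in range(self.n_trees):
                size = max(1, int(np.sqrt(len(reduct))))
                cols = self.rng.choice(reduct, size=size, replace=False)
                self.trees.append(DecisionTreeClassifier().fit(X[:, cols], y))
                self.subsets.append(cols)
            return self

        def predict(self, X):
            votes = [t.predict(X[:, c]) for t, c in zip(self.trees, self.subsets)]
            # plurality vote across trees for each sample
            return np.array([Counter(v).most_common(1)[0][0] for v in zip(*votes)])

The thesis additionally orders the reduced attributes by their importance to the classification task when choosing node splits, and its improved variant tries to replace attributes in a trained tree to simplify its structure without hurting accuracy; neither refinement is reproduced in this sketch.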
Keywords/Search Tags: data mining, attribute reduction, ensemble learning, decision tree, decision forest