
Decision Tree Induction Algorithms Based On Attribute Purity Degree

Posted on: 2021-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Yao
Full Text: PDF
GTID: 2370330623473247
Subject: Mathematics
Abstract/Summary:
Decision tree induction algorithms can be divided into those based on information entropy and those based on rough sets. Rough-set-based decision tree algorithms fall back on an information function to select split nodes when feature selection fails due to granular conflict, which reduces the model's classification accuracy. To solve this problem, a granular computing mechanism is used to propose the attribute purity degree, which represents both accuracy and characterization ability, and the attribute dependency degree is further combined to construct decision tree induction algorithms. The related content involves the following four aspects.

1. First, the purity of a conditional granule with respect to a single decision class is defined (the Micro-Bottom purity). Among the Micro-Bottom purities of a conditional granule over all decision classes, the largest value is selected to indicate the purity of the conditional granule for decision classification (the Meso-Middle purity). Finally, a statistical integration strategy over all granules is used to establish the attribute purity degree (the Macro-Top purity). The attribute purity degree characterizes the accuracy of decision classification with respect to condition classification, and it implements attribute evaluation and feature optimization, so it can serve as a criterion for split node selection in decision trees. Thus a three-layer purity system is established, with a bottom-up hierarchical integration relationship.

2. Based on the quantitative identification characteristics of the attribute purity degree, a one-stage decision tree induction algorithm based on attribute purity degree (Algorithm P) is established; analysis shows that this algorithm suffers from problems such as a complex model structure. The representational differences between the attribute purity degree and the attribute dependency degree, regarding granule structure and classification membership, are further revealed, and this result underlies the subsequent construction of the two-stage algorithm.

3. The heterogeneity of the information gain rate and the attribute dependency degree is analyzed, and the homogeneity of the attribute purity degree and the attribute dependency degree is clarified. On the basis of the one-stage algorithm, the attribute dependency degree is combined to establish a two-stage decision tree induction algorithm (Algorithm DP), in which the qualitative attribute dependency degree is applied first and the quantitative attribute purity degree second.

4. Decision table analysis and data experiments verify the heterogeneity of the information gain rate and the attribute dependency degree, and the homogeneity of the attribute dependency degree and the attribute purity degree. The results demonstrate the effectiveness and improvement of the proposed two-stage algorithm, Algorithm DP.

In summary, through the construction of the three-layer purity system, a quantitative measure of classification accuracy is obtained (the attribute purity degree), and it is selected as the attribute importance index in decision tree construction. Based on the homogeneity of the attribute purity degree and the attribute dependency degree, the two measures are combined to construct a two-stage decision tree algorithm; Algorithm DP achieves higher classification accuracy and better recognition ability.
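The three-layer purity construction and the two-stage split selection described above can be sketched in code. The following is a minimal illustration, not the thesis's actual implementation: the table representation (rows as dicts), the size-weighted aggregation used for the Macro-Top purity, and the tie-breaking form of the two-stage selection are all assumptions made for illustration. The dependency degree follows the standard rough-set definition (fraction of objects in the positive region).

```python
from collections import defaultdict

def granules(rows, attr):
    """Partition row indices into conditional granules (equivalence classes) by attr value."""
    g = defaultdict(list)
    for i, row in enumerate(rows):
        g[row[attr]].append(i)
    return list(g.values())

def attribute_purity(rows, attr, decision):
    """Macro-Top purity: size-weighted aggregation of each granule's Meso-Middle purity.
    (The weighting scheme is an assumed stand-in for the thesis's statistical integration.)"""
    n = len(rows)
    total = 0.0
    for g in granules(rows, attr):
        counts = defaultdict(int)
        for i in g:
            counts[rows[i][decision]] += 1
        # Micro-Bottom purity of g for class c is counts[c] / len(g);
        # Meso-Middle purity takes the largest of these over all decision classes.
        meso = max(counts.values()) / len(g)
        total += len(g) / n * meso
    return total

def attribute_dependency(rows, attr, decision):
    """Rough-set dependency degree: fraction of rows in the positive region of attr."""
    n = len(rows)
    pos = 0
    for g in granules(rows, attr):
        decisions = {rows[i][decision] for i in g}
        if len(decisions) == 1:  # consistent granule -> inside the positive region
            pos += len(g)
    return pos / n

def select_split(rows, attrs, decision):
    """Two-stage selection (assumed form): qualitative dependency degree first,
    quantitative purity degree to decide among attributes with maximal dependency."""
    dep = {a: attribute_dependency(rows, a, decision) for a in attrs}
    best_dep = max(dep.values())
    candidates = [a for a in attrs if dep[a] == best_dep]
    return max(candidates, key=lambda a: attribute_purity(rows, a, decision))

# Toy decision table: condition attributes "a", "b"; decision attribute "d"
rows = [
    {"a": 0, "b": 0, "d": "p"},
    {"a": 0, "b": 0, "d": "p"},
    {"a": 1, "b": 0, "d": "q"},
    {"a": 1, "b": 1, "d": "p"},
]
# attribute_dependency(rows, "a", "d") -> 0.5, attribute_purity(rows, "a", "d") -> 0.75
# select_split(rows, ["a", "b"], "d") -> "a"
```

Note that in this toy table both attributes have the same purity degree (0.75) but different dependency degrees (0.5 vs 0.25), so the qualitative first stage alone decides; the purity degree matters when several attributes tie on dependency.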
Keywords/Search Tags: Rough set, Decision tree, Attribute dependency degree, Attribute purity degree, Feature selection, Machine learning