
Decision Tree Induction Algorithms Based On Attribute Purity Degree

Posted on: 2021-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Yao
Full Text: PDF
GTID: 2370330623473247
Subject: Mathematics
Abstract/Summary:
Decision tree induction algorithms can be divided into those based on information entropy and those based on rough sets. Rough-set-based decision tree algorithms fall back on an information function to select split nodes when feature selection fails due to granular conflict, which reduces the model's classification accuracy. To solve this problem, a granular computing mechanism is used to propose the attribute purity degree, which represents both accuracy and characterization ability, and the attribute dependency degree is further combined to construct decision tree induction algorithms. The related content involves the following four aspects.

1. First, the purity of a conditional granule with respect to a single decision class is defined (the Micro-Bottom purity). Among the Micro-Bottom purities of a conditional granule over all decision classes, the largest value is selected to indicate the purity of the conditional granule for decision classification (the Meso-Middle purity). Finally, a statistical integration strategy over all granules is used to establish the attribute purity degree (the Macro-Top purity). The attribute purity degree characterizes the accuracy of decision classification with respect to condition classification, and it implements attribute evaluation and feature optimization, so it can serve as a criterion for split node selection in decision trees. Thus a three-layer purity system is established, with a bottom-up hierarchical integration relationship.

2. Based on the quantitative identification characteristics of the attribute purity degree, a one-stage decision tree induction algorithm based on attribute purity degree (Algorithm P) is established; analysis shows that this algorithm suffers from problems such as a complex model structure. The representational differences between the attribute purity degree and the attribute dependency degree, regarding granule structure and classification membership, are further revealed, and this result underlies the subsequent construction of the two-stage algorithm.

3. The heterogeneity of the information gain rate and the attribute dependency degree is analyzed, and the homogeneity of the attribute purity degree and the attribute dependency degree is clarified. On the basis of the one-stage algorithm, the attribute dependency degree is combined to establish a two-stage decision tree induction algorithm (Algorithm DP), in which the qualitative attribute dependency degree is applied first and the quantitative attribute purity degree second.

4. Decision table analysis and data experiments verify the heterogeneity of the information gain rate and the attribute dependency degree, and the homogeneity of the attribute dependency degree and the attribute purity degree. The results demonstrate the effectiveness and improvement of the proposed two-stage algorithm, Algorithm DP.

In summary, through the construction of the three-layer purity system, a quantitative measure of classification accuracy is obtained (the attribute purity degree), and it is selected as the attribute importance index in decision tree construction. Based on the homogeneity of the attribute purity degree and the attribute dependency degree, the two measures are combined to construct a two-stage decision tree algorithm; Algorithm DP achieves higher classification accuracy and better recognition ability.
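The three-layer purity construction and the two-stage split selection described above can be sketched in code. The following is a minimal illustration, not the thesis's actual implementation: the table representation (rows as dicts), the size-weighted aggregation used for the Macro-Top purity, and the tie-breaking form of the two-stage selection are all assumptions made for illustration. The dependency degree follows the standard rough-set definition (fraction of objects in the positive region).

```python
from collections import defaultdict

def granules(rows, attr):
    """Partition row indices into conditional granules (equivalence classes) by attr value."""
    g = defaultdict(list)
    for i, row in enumerate(rows):
        g[row[attr]].append(i)
    return list(g.values())

def attribute_purity(rows, attr, decision):
    """Macro-Top purity: size-weighted aggregation of each granule's Meso-Middle purity.
    (The weighting scheme is an assumed stand-in for the thesis's statistical integration.)"""
    n = len(rows)
    total = 0.0
    for g in granules(rows, attr):
        counts = defaultdict(int)
        for i in g:
            counts[rows[i][decision]] += 1
        # Micro-Bottom purity of g for class c is counts[c] / len(g);
        # Meso-Middle purity takes the largest of these over all decision classes.
        meso = max(counts.values()) / len(g)
        total += len(g) / n * meso
    return total

def attribute_dependency(rows, attr, decision):
    """Rough-set dependency degree: fraction of rows in the positive region of attr."""
    n = len(rows)
    pos = 0
    for g in granules(rows, attr):
        decisions = {rows[i][decision] for i in g}
        if len(decisions) == 1:  # consistent granule -> inside the positive region
            pos += len(g)
    return pos / n

def select_split(rows, attrs, decision):
    """Two-stage selection (assumed form): qualitative dependency degree first,
    quantitative purity degree to decide among attributes with maximal dependency."""
    dep = {a: attribute_dependency(rows, a, decision) for a in attrs}
    best_dep = max(dep.values())
    candidates = [a for a in attrs if dep[a] == best_dep]
    return max(candidates, key=lambda a: attribute_purity(rows, a, decision))

# Toy decision table: condition attributes "a", "b"; decision attribute "d"
rows = [
    {"a": 0, "b": 0, "d": "p"},
    {"a": 0, "b": 0, "d": "p"},
    {"a": 1, "b": 0, "d": "q"},
    {"a": 1, "b": 1, "d": "p"},
]
# attribute_dependency(rows, "a", "d") -> 0.5, attribute_purity(rows, "a", "d") -> 0.75
# select_split(rows, ["a", "b"], "d") -> "a"
```

Note that in this toy table both attributes have the same purity degree (0.75) but different dependency degrees (0.5 vs 0.25), so the qualitative first stage alone decides; the purity degree matters when several attributes tie on dependency.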
Keywords/Search Tags: Rough set, Decision tree, Attribute dependency degree, Attribute purity degree, Feature selection, Machine learning