
The Axiomatic Fuzzy Sets Based Fuzzy Decision Tree Algorithms

Posted on: 2014-11-25, Degree: Doctor, Type: Dissertation
Country: China, Candidate: X H Feng
GTID: 1268330425977331, Subject: Control theory and control engineering
Abstract/Summary:
Classification is an active research topic in machine learning and data mining. The decision tree is one of the most important classification algorithms and has been applied to business decision-making, medical diagnosis, and other areas. Axiomatic Fuzzy Sets (AFS) theory is a new method for dealing with fuzzy information. It provides an effective tool for converting the information in training examples and databases into membership functions and their fuzzy logic operations. AFS theory has been applied to knowledge representation, clustering analysis, pattern recognition, and related tasks. This thesis focuses on fuzzy decision tree algorithms based on AFS theory. It also studies related problems such as instance selection and the evaluation of classification rule sets. The main topics are:

1. A new method for constructing a fuzzy decision tree classifier under AFS theory, called the AFS fuzzy decision tree (AFSDT), is proposed. In AFSDT, the fuzzy information gain is used to select the splitting attributes at a level of specificity quantified by a threshold δ, so the tree structure depends on δ. A genetic algorithm is employed to optimize δ in order to obtain a small final tree with good accuracy. Furthermore, a new leaf node labeling method is proposed: the class label of a leaf node is the main class of the training samples it covers, and the relevance of the classification results is quantified by associated confidence levels. Through AFS theory, AFSDT can be applied to data sets with mixed attribute types. AFSDT was evaluated on 28 UCI data sets and compared with SVM, KNN, C4.5, FDTs, FS-DT, FARC and FURIA. The results show that its accuracy is higher than that of the other methods, and statistical tests show that it performs significantly better than KNN, C4.5, FDTs, and FS-DT.

2. The most important problem in decision tree construction is how to select the splitting attribute for a node. This thesis proposes a new splitting attribute selection method based on the Minkowski fuzzy measure. In a fuzzy decision tree, each node is regarded as a fuzzy set over the class distribution of the training samples. The Minkowski fuzzy measure tracks the reduction in uncertainty when a node is split by an attribute, and the attribute that maximizes this reduction is selected as the splitting attribute for the node. Furthermore, the relationship between the threshold δ and the corresponding tree is established: if δ₁ > δ₂, then the tree built with δ₁ can be obtained by properly pruning the tree built with δ₂. This provides a theoretical foundation for optimizing the tree by pruning. Experimental results on 16 UCI data sets show that the proposed fuzzy decision tree is better than fuzzy decision trees based on the fuzzy GINI index and on Min-Ambiguity in terms of accuracy and tree size.

3. Based on the Pattern Tree algorithm and AFS theory, a new classification method named AFS based Pattern Tree Rules (AFSPTR) is proposed. AFSPTR generates fuzzy classification rules by aggregating fuzzy concepts. First, AFS membership functions partition the feature space according to the distribution of the samples. AFSPTR then aggregates the fuzzy concepts using a new aggregation objective function that involves similarity and entropy and expresses the discriminatory capability of the aggregation result. In essence, this balances accuracy against simplicity. The performance of AFSPTR is compared with that of 7 rule-based classifiers (C4.5, Decision Table, JRip, NNge, OneR, PART and Ridor) on 8 UCI data sets. The results show that AFSPTR achieves the smallest rule base and that its accuracy is significantly better than that of Decision Table and OneR.

4. A new instance selection algorithm based on Affinity Propagation clustering is proposed. The instances are grouped into clusters, the representative group center (exemplar) for each instance is identified, and a classifier is trained on the exemplars of the clusters. Furthermore, a new consistency measure for two classification rule sets is proposed, based on the similarity of the partitions of the feature space induced by the rule sets. The main idea is to compare the core spaces of the rules in the two rule sets. The proposed consistency measure can be used to measure the consistency of two different rule sets, to verify whether the rules in a knowledge base need to be updated, and to select the best classifier among several candidates. In the experiments, five decision tree algorithms are compared under the consistency measure, and the consistency of C4.5 is checked on 17 data sets.
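The split selection and leaf labeling described in topic 1 can be illustrated with a minimal sketch. It is not the thesis's AFSDT: membership degrees are supplied as plain numbers instead of being derived from AFS membership functions, the entropy is the common membership-weighted form used in fuzzy ID3-style trees, child memberships are formed with a product t-norm, and the role of the threshold δ is not modeled. All function names are hypothetical.

```python
import math


def class_mass(memberships, labels):
    """Sum the membership degrees of the samples in each class."""
    mass = {}
    for m, y in zip(memberships, labels):
        mass[y] = mass.get(y, 0.0) + m
    return mass


def fuzzy_entropy(memberships, labels):
    """Entropy of the class distribution, weighting samples by membership."""
    total = sum(memberships)
    if total == 0:
        return 0.0
    shares = [v / total for v in class_mass(memberships, labels).values() if v > 0]
    return -sum(p * math.log2(p) for p in shares)


def fuzzy_information_gain(node, branches, labels):
    """Node entropy minus the mass-weighted entropy of its child nodes.

    node:     membership of each sample in the current node
    branches: one membership list per fuzzy term of the candidate attribute
    """
    # Assumption: child membership = node membership * term membership.
    children = [[n * b for n, b in zip(node, branch)] for branch in branches]
    total = sum(sum(child) for child in children)
    if total == 0:
        return 0.0
    weighted = sum(sum(c) / total * fuzzy_entropy(c, labels) for c in children)
    return fuzzy_entropy(node, labels) - weighted


def select_split(node, candidates, labels):
    """Return the name of the attribute with the largest fuzzy information gain."""
    return max(candidates,
               key=lambda name: fuzzy_information_gain(node, candidates[name], labels))


def label_leaf(memberships, labels):
    """Label a leaf with its main class and report that class's share as confidence."""
    mass = class_mass(memberships, labels)
    total = sum(mass.values())
    if total == 0:
        raise ValueError("leaf covers no samples")
    best = max(mass, key=mass.get)
    return best, mass[best] / total


labels = ["a", "a", "b", "b"]
node = [1.0, 1.0, 1.0, 1.0]
candidates = {
    "x1": [[1, 1, 0, 0], [0, 0, 1, 1]],  # separates the two classes
    "x2": [[0.5] * 4, [0.5] * 4],        # carries no class information
}
best_attribute = select_split(node, candidates, labels)
leaf_label, leaf_confidence = label_leaf([0.9, 0.6, 0.5], ["a", "a", "b"])
```

Here `x1` is chosen because splitting on it removes all class uncertainty, while `x2` leaves the class distribution unchanged in both branches.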
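The idea in topic 4 of comparing how two rule sets partition the feature space can be approximated with a simple stand-in. The thesis compares the core spaces of the rules. The sketch below substitutes a cruder proxy, namely the fraction of sample points on which two rule sets assign the same class. The rule representation (an interval box per rule, first matching rule wins) and all names are assumptions made for illustration.

```python
def predict(rule_set, point, default=None):
    """Return the class of the first rule whose box contains the point."""
    for box, label in rule_set:
        if all(lo <= point[f] <= hi for f, (lo, hi) in box.items()):
            return label
    return default


def agreement(rules_a, rules_b, points):
    """Fraction of points on which both rule sets predict the same class."""
    if not points:
        raise ValueError("at least one sample point is required")
    same = sum(predict(rules_a, p) == predict(rules_b, p) for p in points)
    return same / len(points)


# Two hypothetical rule sets over one feature "x" with slightly different cut points.
rules_a = [({"x": (0, 5)}, "low"), ({"x": (5, 10)}, "high")]
rules_b = [({"x": (0, 4)}, "low"), ({"x": (4, 10)}, "high")]
points = [{"x": x} for x in range(11)]

score = agreement(rules_a, rules_b, points)
```

A score near 1 suggests the two rule sets carve up the sampled region in almost the same way, and a low score suggests that one of them may need to be updated. The quality of the estimate depends on how well the sample points cover the feature space.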
Keywords/Search Tags: AFS Theory, Fuzzy Decision Tree, Fuzzy Classifier Design, Classifier Evaluation