
The Research And Application Of Decision Tree Based On Fuzzy Theory

Posted on: 2018-02-20
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Yu
Full Text: PDF
GTID: 2348330515468011
Subject: Computer technology
Abstract/Summary:
In the field of data mining, classification is a very important task, and the decision tree is a simple, efficient algorithm that is widely used for it. A decision tree has a simple structure that is easy to understand; moreover, it is reusable and achieves high classification accuracy. The classical decision tree algorithm, however, is not good at handling fuzziness in data. With the application of fuzzy theory in machine learning and artificial intelligence, fuzzy decision tree algorithms such as FuzzyID3 and Min-Ambiguity were established by combining fuzzy set theory with the decision tree algorithm. The fuzzy decision tree refines the classical decision tree algorithm, has had a far-reaching influence on its development, and improves the ability of decision trees to deal with uncertain data. The main work of the thesis is as follows:

(1) The thesis expounds the basic knowledge of decision trees and fuzzy theory, summarizes how different decision tree algorithms differ in their criteria for selecting splitting attributes, and analyzes different decision tree pruning methods. It focuses on comparing decision trees and fuzzy decision trees in terms of the tree-building process, data preprocessing, algorithmic complexity, rule-matching method, and scope of application, and then summarizes their advantages and disadvantages.

(2) A method is proposed that obtains the cluster centers of each continuous attribute with the K-means algorithm and then fuzzifies the continuous data with triangular fuzzy numbers. Furthermore, a fuzzy decision system based on the FuzzyID3 and Min-Ambiguity algorithms is designed. Together with the C4.5 and CART implementations in the open-source data mining software Weka, the four decision tree algorithms are compared experimentally in terms of classification accuracy and number of rules. The FuzzyID3 algorithm is found to achieve higher accuracy and fewer rules on each data set. The CART algorithm generates the fewest rules, because it builds a binary tree and uses the Gini index as the criterion for selecting the splitting attribute. A comparison of the two fuzzy decision tree algorithms shows that FuzzyID3 is generally better than Min-Ambiguity. In addition, the influence of the truth degree on the fuzzy decision tree is analyzed experimentally.

(3) The fuzzy decision tree algorithm is applied to email classification. An email classification model based on the FuzzyID3 algorithm and email behavior is designed, and a scheme for selecting email feature attributes, with corresponding processing methods, is proposed. The model is found to achieve high recall and precision in email classification, which makes it more effective at identifying spam.
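The fuzzification step in (2) can be illustrated with a minimal sketch. The abstract does not give the thesis's exact construction, so the following is an assumption: one fuzzy term is created per K-means cluster center, each triangular membership function peaks at its own center and falls to zero at the neighboring centers, and the two outermost terms are treated as shoulders. The function names `kmeans_1d` and `triangular_memberships` are hypothetical and chosen only for this illustration.

```python
def kmeans_1d(values, k, iters=100):
    """Plain 1-D k-means; returns the k cluster centers in ascending order."""
    data = sorted(values)
    # Deterministic initialisation (an assumption): spread the initial
    # centers evenly over the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        # Keep the old center if a cluster ends up empty.
        updated = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sorted(centers)


def triangular_memberships(x, centers):
    """Membership degree of x in each fuzzy term (one term per center)."""
    if len(set(centers)) != len(centers):
        raise ValueError("cluster centers must be distinct")
    k = len(centers)
    degrees = []
    for i, c in enumerate(centers):
        if x == c:
            mu = 1.0
        elif x < c:
            if i == 0:
                mu = 1.0  # left shoulder: everything below the first center
            else:
                left = centers[i - 1]
                mu = max(0.0, (x - left) / (c - left))
        else:
            if i == k - 1:
                mu = 1.0  # right shoulder: everything above the last center
            else:
                right = centers[i + 1]
                mu = max(0.0, (right - x) / (right - c))
        degrees.append(mu)
    return degrees


# Example: three well-separated groups of values for one continuous attribute.
attribute = [1.0, 2.0, 3.0, 9.0, 10.0, 11.0, 19.0, 20.0, 21.0]
centers = kmeans_1d(attribute, k=3)
print(centers)                               # [2.0, 10.0, 20.0]
print(triangular_memberships(6.0, centers))  # [0.5, 0.5, 0.0]
```

With this construction, the memberships of any value lying between two adjacent centers sum to 1, so each crisp value is shared between at most two neighboring fuzzy terms before being passed to a fuzzy decision tree learner.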
Keywords/Search Tags: Data mining, Decision tree, Fuzzy decision tree, Email classification