The Research Of Optimizing Algorithms Decision Tree Based On Rough Set Theory

Posted on: 2014-05-12; Degree: Master; Type: Thesis
Country: China; Candidate: X P Wang; Full Text: PDF
GTID: 2268330425957585; Subject: Basic mathematics
Abstract/Summary:
Many important research areas, such as artificial intelligence and machine learning, are involved in data mining (DM), which is also known as knowledge discovery in databases. Classification is one of the key tasks of DM. Currently, it is primarily applied in fields such as diagnosis, prediction, discrimination, and screening. Compared with other classification models, the decision tree is simple, easy to understand, and easy to operate, and its classification accuracy is no lower than that of other classification models. Rough sets can handle uncertain knowledge. They can be used to find the internal relationships among inaccurate and noisy data, and to generate more robust and better optimized decision trees. In this thesis, various decision tree algorithms based on rough set theory are studied in depth. The main work is as follows:

1. Based on variable precision rough set theory (VPRS), the concepts of variable precision area and non-variable precision area are defined to replace the original concepts of precision area and non-precision area.
2. The advantages and disadvantages of HACRS, the univariate rough-set-based decision tree algorithm proposed by Jinmao-Wei et al., are carefully analysed. A new univariate decision tree algorithm, HACBRS, based on variable precision rough set theory, is proposed by combining these two concepts and by replacing the information gain criterion of ID3 with that of C4.5.
3. When the HACBRS algorithm partitions the data set, a classification error parameter is introduced to weaken the impact of a small amount of noisy data on the results, so that the resulting decision tree is not overfitted and its generalization ability is greatly improved.
4. These algorithms are compared with the classic ID3 on a practical example.
5. A comparison of the decision trees based on rough set theory with those based on the information entropy of ID3 shows that the RS-based decision tree is smaller in scale and simpler in form; at the same time, the impact of noisy data is suppressed and classification accuracy is much higher. We therefore conclude that the algorithm based on variable precision rough set theory has certain advantages over the classic RS-based algorithm.
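
The abstract does not give the thesis's formal definitions, so the following is only a minimal sketch of how a classification error parameter can make a rough-set partition tolerant of noise. It assumes the standard variable precision definition, in which an equivalence class is admitted to the β-lower approximation of a target set when its misclassification rate is at most β. The function names, the toy data, and the value β = 0.25 are hypothetical and are not taken from HACBRS.

```python
from collections import defaultdict


def equivalence_classes(rows, attrs):
    """Group row indices by their values on the condition attributes `attrs`."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def beta_lower_approximation(blocks, target, beta):
    """Union of the blocks whose misclassification rate w.r.t. `target` is <= beta.

    With beta = 0 this reduces to the classical rough-set lower approximation.
    """
    lower = set()
    for block in blocks:
        error = 1 - len(block & target) / len(block)
        if error <= beta:
            lower |= block
    return lower


# Hypothetical toy data: row 3 is a noisy record inside the "sunny" block.
rows = [
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "sunny", "play": "no"},
    {"outlook": "rainy", "play": "no"},
    {"outlook": "rainy", "play": "no"},
]
target_yes = {i for i, r in enumerate(rows) if r["play"] == "yes"}
blocks = equivalence_classes(rows, ["outlook"])

strict = beta_lower_approximation(blocks, target_yes, beta=0.0)
tolerant = beta_lower_approximation(blocks, target_yes, beta=0.25)
print(sorted(strict), sorted(tolerant))
```

With β = 0 the single noisy record keeps the whole "sunny" block out of the lower approximation, whereas with β = 0.25 the block is admitted. This illustrates the mechanism by which a small amount of noise can be prevented from forcing further splits of a node.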
Keywords/Search Tags: Data Mining, Decision Tree, Rough Set, Variable Precision Rough Set