Improvement And Research Of Decision Tree Classification Algorithm Based On The Rough Set

Posted on: 2015-04-14  Degree: Master  Type: Thesis
Country: China  Candidate: Z Zhong  Full Text: PDF
GTID: 2298330434456359  Subject: Signal and Information Processing
Abstract/Summary:
Data mining is the process of discovering relationships among the data in a given data set. Current data mining research focuses on clustering, knowledge analysis, decision support, and related areas. Classification is one of the core technologies of data mining and an active research topic in the field. Rough set theory, meanwhile, is an important inductive method that plays an irreplaceable role in data representation, data mining, and knowledge discovery.

Traditional algorithms must compare all of the test values, so their time performance is poor and their classification accuracy is not high. To obtain better classification results, reduce the size of the decision tree, and improve classification accuracy, this thesis proposes the following improvements based on the attribute selection criterion.

First, we combine rough set theory with decision tree techniques. Because decision attribute values have to be compared pairwise, which causes excessive time overhead, improved algorithm a is proposed. The algorithm defines two parameters, the "first important" and the "second most important", which reduce the number of comparisons between objects, effectively improve time performance, and ultimately yield a simple decision tree.

Second, to further reduce the size of the decision tree, and because univariate trees do not fully consider the correlation between attributes, we introduce improved algorithm b, a multivariate algorithm based on the distinguishing value. The algorithm takes the two attributes in the "first important" set that rank highest by the parameter I(a) (the number of different values) as the multivariate test attributes. If the "first important" attribute set is empty, the attributes in the "second most important" set are ordered by distinguishing value and used for classification, which further improves the decision tree algorithm.

Finally, we compare three algorithms on three data sets from the UCI database: ID3, the decision tree built on the distinguishing value (improved algorithm a), and the multivariate decision tree based on the distinguishing value (improved algorithm b). The experiments show that improved algorithm a reduces the cost of the tree while improving classification accuracy; however, the size of the generated decision tree leaves room for further optimization. Improved algorithm b constructs a decision tree with a simpler structure than the other two, and its classification accuracy is further improved.
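The abstract does not define the distinguishing value precisely, so the following is only a minimal sketch under an assumed definition common in rough-set work: the number of object pairs that differ on both a condition attribute and the decision attribute. The function names `discern_count_pairwise` and `discern_count_by_counting`, and the toy table used to exercise them, are hypothetical and not taken from the thesis. The second function shows one standard way to avoid pairwise comparison of objects (inclusion-exclusion over value counts); it is not necessarily the technique used in improved algorithm a.

```python
from collections import Counter
from itertools import combinations


def pairs(k):
    """Number of unordered pairs that can be drawn from k objects."""
    return k * (k - 1) // 2


def discern_count_pairwise(rows, attr, decision):
    """Assumed definition: count object pairs that differ on both
    `attr` and `decision`, by comparing every pair (quadratic time)."""
    return sum(
        1
        for x, y in combinations(rows, 2)
        if x[attr] != y[attr] and x[decision] != y[decision]
    )


def discern_count_by_counting(rows, attr, decision):
    """Same quantity without comparing objects pairwise.

    Inclusion-exclusion: pairs differing on both =
    all pairs - pairs equal on attr - pairs equal on decision
    + pairs equal on both.
    """
    by_attr = Counter(r[attr] for r in rows)
    by_dec = Counter(r[decision] for r in rows)
    by_both = Counter((r[attr], r[decision]) for r in rows)
    return (
        pairs(len(rows))
        - sum(pairs(c) for c in by_attr.values())
        - sum(pairs(c) for c in by_dec.values())
        + sum(pairs(c) for c in by_both.values())
    )


def best_attribute(rows, attrs, decision):
    """Pick the attribute with the largest count (ties: first in `attrs`)."""
    return max(attrs, key=lambda a: discern_count_by_counting(rows, a, decision))
```

Under this assumed definition, an attribute with a larger count separates more objects that belong to different classes, which is why it is a plausible candidate for a split.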
Keywords/Search Tags:Data Mining, Algorithm of Decision Tree, Rough Set, Discern Value