Construction And Disposition Of Combinatorial Classifier Based On Decision Tree

Posted on: 2009-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: J B Hu
GTID: 2178360245475235
Subject: Computer software and theory
Abstract/Summary:
The decision tree is one of the most widely used data mining methods. Research on decision trees focuses on prediction accuracy, efficiency, and reducing the dimensionality of datasets. Scalability is a primary feature of decision trees. SURPASS is a scalable decision tree that can handle datasets whose size exceeds the capacity of main memory, but it is inefficient on very large datasets. In addition, a decision tree uses impurity to select the best split attribute. When the dataset is large, several attributes may qualify as the best split attribute, which makes it possible to build a random forest over the dataset. A traditional single classifier may not meet the requirement of high prediction accuracy, and the ways in which data is generated, stored, and used call for better classifiers. Some scholars have found that single classifiers carry mutually complementary information, and they suggest using this information to improve classifier performance.

To address the efficiency of SURPASS, this thesis proposes an index based on the amount of information from information theory. After the index is computed for every candidate attribute, the attribute with the largest value is selected as the best split attribute, which reduces the number of disk accesses. Experiments show that this method is effective. To justify the index theoretically, the thesis derives it using differential calculus. The derivation yields two ways of calculating the index and demonstrates its advantages. The thesis also builds a random forest based on SURPASS and verifies the properties of the random forest experimentally.
Keywords/Search Tags: decision tree, SURPASS, amount of information, linear discriminant analysis, combinatorial classifier
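
The abstract describes choosing a split attribute by an information-based score and notes that several attributes may tie for best. The thesis's own "index of amount of information" is not defined in the abstract, so the minimal sketch below substitutes standard Shannon information gain as a stand-in. The function names, the tolerance `tol`, and the toy data are all hypothetical and are not taken from the thesis or from SURPASS.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction from partitioning the rows on one categorical attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    # Weighted entropy of the partitions produced by the split.
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in partitions.values()
    )
    return entropy(labels) - remainder


def best_split_attributes(rows, labels, attrs, tol=1e-12):
    """Return all attributes whose gain is within tol of the maximum, plus all gains.

    Returning every tied attribute (an assumption of this sketch) reflects the
    abstract's remark that a large dataset may have many best split attributes.
    """
    gains = {a: information_gain(rows, labels, a) for a in attrs}
    top = max(gains.values())
    return [a for a in attrs if top - gains[a] <= tol], gains


# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]
best, gains = best_split_attributes(rows, labels, ["outlook", "windy"])
```

When more than one attribute is returned, a random forest builder could pick among the tied candidates at random for each tree. This is one plausible reading of the abstract, not a confirmed detail of the thesis.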