Research And Application On The Decision Tree Classification Algorithm Of Data Mining

Posted on: 2008-09-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y Feng
GTID: 2178360215965092
Subject: Computer software and theory
Abstract/Summary:
The decision tree is the most widely adopted model in classification applications. Compared with neural networks (NN) and Bayesian methods, it does not need a long training time or hundreds of iterations to build a model, and it is suitable for large data sets. Moreover, its classification accuracy is better than that of other techniques, and the algorithm needs no information beyond the training data. The core issues of a decision tree algorithm are the strategy for choosing the test attribute and the pruning of the tree. Discretization of continuous attributes and dimension reduction of high-dimensional data are critical techniques for extending the application domain of decision tree algorithms.

Based on the decision tree, the main research contents of the thesis are as follows:

(1) A novel dimension reduction algorithm is proposed. First, all condition attributes are ranked by importance. The attributes are then reduced using a neural network, which requires no prior knowledge and classifies efficiently. Finally, the attributes that are most effective for classifying the data are selected, which reduces the dimension.

(2) A weighted binary search algorithm is proposed to discretize continuous attributes. It is simpler, easier to implement, and more efficient than the classical binary search algorithm, which partitions the value range too simply and tends to get trapped at a local maximum.

(3) An improved attribute selection criterion is proposed. It overcomes the bias that the ID3 and C4.5 algorithms show when selecting the test attribute, requires less computing time, and improves the classification efficiency of the decision tree classifier.

(4) Building on the preceding work, the classical decision tree is optimized and integrated, and an improved algorithm procedure is proposed. Experimental results show its superiority over the C4.5 algorithm.

(5) The algorithm is applied in an image database data mining system. It is trained on the feature data extracted from the images, a decision tree is built, and the data are then classified. The results are more transparent, portable, and valid.

The research work is supported by the key national science and technology project of the "Five-year plan key technology research and demonstration of Integrated Risk Guardians" (No. 2006BAD20B02).
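For readers unfamiliar with the baselines named in the abstract, the following minimal sketch shows the standard ID3 criterion (information gain), the standard C4.5 criterion (gain ratio), and a plain binary-split search for a threshold on a continuous attribute. It does not reproduce the thesis's improved criterion or its weighted binary search, whose details are not given in the abstract. All function names and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of the value distribution in `items`."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def information_gain(values, labels):
    """ID3 criterion: entropy of labels minus the weighted entropy after
    partitioning the samples by attribute value."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(values, labels):
    """C4.5 criterion: information gain divided by split information
    (the entropy of the attribute's own value distribution)."""
    split_info = entropy(values)
    if split_info == 0:
        return 0.0  # attribute has a single value; it cannot split the data
    return information_gain(values, labels) / split_info


def best_threshold(values, labels):
    """Plain (unweighted) binary discretization: try the midpoint between
    each pair of adjacent distinct values and keep the one with the
    highest information gain. Returns (threshold, gain)."""
    distinct = sorted(set(values))
    best_t, best_gain = None, -1.0
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        gain = information_gain([v <= t for v in values], labels)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


# Hypothetical toy data: four samples, two classes.
labels = ["x", "x", "y", "y"]
two_valued = ["a", "a", "b", "b"]   # attribute with two values
unique_id = [1, 2, 3, 4]            # attribute with one value per sample
continuous = [1.0, 2.0, 3.0, 4.0]   # continuous attribute

print(information_gain(two_valued, labels), gain_ratio(two_valued, labels))
print(information_gain(unique_id, labels), gain_ratio(unique_id, labels))
print(best_threshold(continuous, labels))
```

In the toy data, the two-valued attribute and the per-sample identifier both reach the same information gain, yet the identifier is useless for generalization. The gain ratio penalizes it through the larger split information. This is the kind of selection bias that item (3) refers to.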
Keywords/Search Tags:data mining, decision tree, discretization, dimension reduction, attribute choosing