Research On The Algorithm Optimization Of Decision Trees Classification

Posted on: 2008-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: P L Chen
Full Text: PDF
GTID: 2178360215985567
Subject: Computer application technology
Abstract/Summary:
Data mining is the process of extracting hidden and potentially useful information from large volumes of data. It is a relatively new data analysis technology, widely used in banking and finance, insurance, government, education, transportation, national defense, and other fields. Classification is one of the central tasks in data mining. Among the many classification methods, the instance-based decision tree algorithm is widely used because it yields explicit rules, requires comparatively little computation, exposes the attributes that matter most to a decision, and achieves high classification accuracy. According to related statistics, the decision tree is currently one of the most popular data mining algorithms.

Most existing decision tree algorithms nevertheless show problems when applied to real tasks, notably multi-value bias (a preference for attributes with many distinct values) and low computational efficiency. Improving the performance of decision trees so that they better meet the requirements of practical applications is therefore of both theoretical and practical significance. This thesis studies these issues in knowledge discovery in databases, with the aim of exploring the optimization and combination of decision tree techniques in data mining so that they can be applied to real tasks. The main contents are as follows.

First, the thesis gives an overview of the basic theory of data mining and classification technology, with particular emphasis on the analysis and comparison of decision tree algorithms. Second, it proposes a combined optimization decision tree algorithm suitable for multi-dimensional databases. Compared with traditional classification algorithms, the algorithm makes improvements in several respects: dimensionality reduction, attribute selection, scalability, and pruning. The main contribution is a new decision tree algorithm, DTBAC, which is based on weighted attribute consistency and optimized pruning, together with the FAVC set, which enhances the algorithm's scalability. Finally, a combined optimization decision tree classifier using the DTBAC algorithm is developed and applied in the medical field to classify patients. The analysis results show that DTBAC outperforms the widely used ID3 algorithm in many respects.
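The multi-value bias mentioned in the abstract can be illustrated with the information-gain criterion that ID3 uses to choose split attributes. The following is a minimal sketch of that baseline criterion only; it does not implement DTBAC or the FAVC set, whose details are not given here. The toy patient records and the attribute names (`patient_id`, `fever`, `sick`) are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attr, target):
    """ID3-style information gain of splitting `rows` on attribute `attr`."""
    base = entropy([r[target] for r in rows])
    # Group the target labels by the value each row takes for `attr`.
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    # Weighted average entropy of the partitions produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy data: `patient_id` is unique per row and carries no
# predictive meaning, while `fever` is only weakly related to the class.
rows = [
    {"patient_id": 1, "fever": "yes", "sick": "yes"},
    {"patient_id": 2, "fever": "yes", "sick": "yes"},
    {"patient_id": 3, "fever": "yes", "sick": "no"},
    {"patient_id": 4, "fever": "no", "sick": "no"},
    {"patient_id": 5, "fever": "no", "sick": "no"},
    {"patient_id": 6, "fever": "no", "sick": "yes"},
]

gains = {a: information_gain(rows, a, "sick") for a in ("patient_id", "fever")}
best = max(gains, key=gains.get)
print(gains, "-> ID3 would split on:", best)
```

Because every `patient_id` value isolates a single record, each partition is pure and the attribute receives the maximum possible gain, even though it is useless for classifying new patients. This is the kind of bias that motivates alternative attribute-selection measures.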
Keywords/Search Tags:data mining, classification, Decision Tree algorithm, ID3, DTBAC