
Decision Tree Algorithm and Optimization in Data Mining

Posted on: 2009-09-08
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ding
GTID: 2208360245967339
Subject: Computer application technology
Abstract/Summary:
Data classification has become one of the important research areas of data mining. Classification generates a precise description, or model, of a predetermined set of data classes or concepts by partitioning data objects according to their features. These models can then be used to classify future data objects, which gives the approach good application prospects. The most popular classification methods at present include genetic algorithms, decision trees, and neural networks.

Among these three methods, the decision tree algorithm is simple to describe and easy to translate into classification rules, but it can hardly find the globally optimal solution. The genetic algorithm can handle huge search spaces, multiple peaks, and non-linearity, but it suffers from premature convergence. Therefore, a classification rule mining method called GSDA, based on a hybrid of genetic and simulated annealing algorithms, is proposed. The algorithm introduces a direct tree encoding method to improve accuracy, and it introduces hybrid optimization to address the problem of local optima. The fitness function and the pruning operation are also improved, so that the mined rules are more accurate and the algorithm is simpler and easier to understand.

These claims are examined experimentally. Four databases are used (the weather database, the Cleveland database, the Heart Disease database, and the Breast Cancer-W database) to compare the results of the GSDA algorithm with those of the classic ID3 algorithm. The experiments show that the GSDA algorithm performs better than the ID3 algorithm.
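The idea of combining a genetic algorithm with simulated annealing, as described in the abstract, can be illustrated with a minimal sketch. It is not the thesis's GSDA method: the abstract does not give the tree encoding, fitness function, or pruning step, so the sketch evolves plain real numbers on a toy multi-peak objective. The names `fitness`, `accept`, and `hybrid_search`, the objective, and all parameter values are assumptions made for illustration. What it shows is the general mechanism: a child produced by crossover and mutation replaces its parent according to the Metropolis acceptance rule, so worse candidates are sometimes kept early on (which counteracts premature convergence) and are rejected more often as the temperature cools.

```python
import math
import random


def fitness(x):
    # Toy multi-peak objective on [0, 1]; an assumption for illustration,
    # not the fitness function used in the thesis.
    return math.sin(5 * x) * (1 - abs(x - 0.5)) + 1.0


def accept(old_fit, new_fit, temperature, rng):
    """Metropolis rule: always accept an improvement; accept a worse
    candidate with probability exp((new_fit - old_fit) / temperature)."""
    if new_fit >= old_fit:
        return True
    return rng.random() < math.exp((new_fit - old_fit) / temperature)


def hybrid_search(pop_size=20, generations=100, t0=1.0, cooling=0.95, seed=0):
    rng = random.Random(seed)
    population = [rng.random() for _ in range(pop_size)]
    best = max(population, key=fitness)
    temperature = t0
    for _ in range(generations):
        next_pop = []
        for parent in population:
            # Genetic step: arithmetic crossover with a random mate,
            # followed by Gaussian mutation, clipped to [0, 1].
            mate = rng.choice(population)
            child = (parent + mate) / 2 + rng.gauss(0, 0.1)
            child = min(1.0, max(0.0, child))
            # Annealing step: the child replaces the parent only if the
            # Metropolis rule accepts it at the current temperature.
            if accept(fitness(parent), fitness(child), temperature, rng):
                next_pop.append(child)
            else:
                next_pop.append(parent)
        population = next_pop
        # Track the best individual seen so far (elitist bookkeeping).
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
        temperature *= cooling  # geometric cooling schedule
    return best, fitness(best)


if __name__ == "__main__":
    x, f = hybrid_search()
    print(f"best x = {x:.4f}, fitness = {f:.4f}")
```

In a rule-mining setting the individuals would be decision trees rather than numbers, and crossover and mutation would operate on subtrees; the acceptance rule and cooling schedule would stay structurally the same.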
Keywords/Search Tags: data mining, classification rule, genetic algorithm, GSDA algorithm