
Research On Decision Tree And Its Application

Posted on: 2009-07-24 | Degree: Master | Type: Thesis
Country: China | Candidate: B Wang | Full Text: PDF
GTID: 2178360242972733 | Subject: Computer software and theory
Abstract/Summary:
Data mining (DM) is a technique for analyzing and understanding large volumes of source data and revealing the knowledge hidden in them. Classification is an important data mining technology, and decision tree classification is a very effective classification method. Many decision tree classification algorithms have been proposed so far, each with its own strengths in execution speed, scalability, intelligibility of output, and classification accuracy. However, these algorithms still have shortcomings. Further optimizing decision tree algorithms will help not only to refine the theory but also to promote its popularization and application.

As a popular decision tree algorithm, ID3 is widely used because its idea is simple and it is easy to implement. However, the tree it produces is usually too large and complex, which restricts the algorithm's performance. To make the tree-building process more efficient and to avoid overfitting, we improved the ID3 algorithm. The improved algorithm takes the classification effect of each classifying attribute into account: if the classification effect on a branch reaches a certain level, classification of that branch is terminated. It does so by using the maximum class support and adopting a pre-pruning strategy. The experimental results show that the improved algorithm produces simpler decision trees without reducing precision.

This thesis focuses on the improvement of ID3 and analyzes the performance of the improved algorithm through experiments. It then applies the algorithm to mining campus smart card data in connection with food preparation. The mining has two steps. First, we pre-process the campus smart card data sample with SQL and use a clustering method in SPSS to divide the time range; the result is treated as one of the inputs to step two, classification. In step two, we use the improved ID3 to conduct the mining and obtain useful results.
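The pre-pruning idea described in the abstract (stop growing a branch once its dominant class is supported strongly enough) can be illustrated with a minimal sketch. This is not the thesis's actual implementation: the function names, the dictionary-based tree layout, and the `min_support` threshold value are assumptions introduced here for illustration, and "maximum class support" is interpreted as the fraction of samples in a node that belong to its most frequent class.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the samples on `attr`."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def build_tree(rows, labels, attrs, min_support=0.9):
    """ID3 with pre-pruning on maximum class support.

    `min_support` is a hypothetical threshold (assumption, not from the
    thesis): a node becomes a leaf as soon as its most frequent class
    covers at least this fraction of the node's samples.
    """
    majority, count = Counter(labels).most_common(1)[0]
    # Pre-pruning: stop when the dominant class is supported strongly
    # enough, or when no attributes are left to split on.
    if count / len(labels) >= min_support or not attrs:
        return majority
    # Standard ID3 step: choose the attribute with the highest information gain.
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attrs if a != best]
    node = {"attr": best, "default": majority, "children": {}}
    for value in sorted({row[best] for row in rows}):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows = [r for r, _ in subset]
        sub_labels = [l for _, l in subset]
        node["children"][value] = build_tree(sub_rows, sub_labels, remaining, min_support)
    return node


def classify(tree, row):
    """Walk the tree; fall back to the node's majority class for unseen values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree
```

With `min_support=1.0` the sketch behaves like plain ID3 and splits until every leaf is pure; lowering the threshold ends branches earlier and yields a smaller tree, at the possible cost of some accuracy on the training data.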
Keywords/Search Tags:data mining, classification, decision tree, ID3 algorithm, smart card for campuses