
Research on Classification Algorithms for Data Mining

Posted on: 2007-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: C Peng
Full Text: PDF
GTID: 2178360185475618
Subject: Computer application technology
Abstract/Summary:
As people's ability to understand and manage the world improves, the world is described ever more completely and the volume of data grows ever larger. Much of this data, however, is not used effectively. People need powerful tools to extract knowledge from data, and data mining techniques emerged to meet this demand.

Classification is an important and active research field in data mining. This thesis surveys the important classification algorithms and points out that, as data volumes grow, an algorithm's running time and scalability become more important, as does the ease with which its output can be understood. Among classification algorithms, decision trees are particularly well suited to data mining and are widely used. This thesis surveys the typical decision tree algorithms. Among these, the SPRINT algorithm stands out. It removes all memory restrictions, and it is fast and scalable. The decision trees it builds are compact and accurate. The algorithm is also designed to be easily parallelized, and the parallel version exhibits excellent scalability, speedup and sizeup. Together, these characteristics make SPRINT an ideal tool for data mining.

This thesis studies how to implement the SPRINT algorithm in C++. The program is tested, and the results show that it is accurate and scalable, so the algorithm is implemented well. In addition, the thesis studies how to speed up decision tree construction in SPRINT and presents a new method, which can be described briefly as follows. First, attributes are divided into normal attributes and special attributes. Second, the tree construction procedure of SPRINT is improved in different ways according to the attribute type. The new method is tested, and the results show that it speeds up decision tree construction in SPRINT effectively and is better than existing methods in some respects.
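
For readers unfamiliar with SPRINT, the following C++ fragment is a minimal sketch of the step that dominates tree construction in the published algorithm: scanning a sorted attribute list for a continuous attribute and choosing the split point with the lowest weighted gini index. It is not the thesis's program and does not include the thesis's speed-up method, which the abstract does not detail. The names `AttrEntry`, `SplitResult`, `gini` and `bestContinuousSplit` are hypothetical, and the list is sorted inside the function only to keep the sketch self-contained (SPRINT itself sorts each attribute list once and keeps it ordered as nodes are partitioned).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical entry of an attribute list for one continuous attribute.
struct AttrEntry {
    double value;     // attribute value of the record
    int label;        // class label, assumed to lie in [0, numClasses)
    std::size_t rid;  // record id linking entries across attribute lists
};

// Hypothetical result type: the candidate test is "value <= threshold".
struct SplitResult {
    bool found = false;
    double threshold = 0.0;
    double gini = std::numeric_limits<double>::infinity();
};

// gini(S) = 1 - sum_j p_j^2, computed from a class histogram of n records.
double gini(const std::vector<std::size_t>& hist, std::size_t n) {
    if (n == 0) return 0.0;
    double sumSq = 0.0;
    for (std::size_t count : hist) {
        const double p = static_cast<double>(count) / static_cast<double>(n);
        sumSq += p * p;
    }
    return 1.0 - sumSq;
}

// Scan the sorted list once, moving records from the "above" histogram to
// the "below" histogram, and keep the midpoint with the lowest weighted gini.
SplitResult bestContinuousSplit(std::vector<AttrEntry> list, int numClasses) {
    std::sort(list.begin(), list.end(),
              [](const AttrEntry& a, const AttrEntry& b) { return a.value < b.value; });

    const std::size_t k = static_cast<std::size_t>(numClasses);
    std::vector<std::size_t> below(k, 0), above(k, 0);
    for (const AttrEntry& e : list) {
        assert(e.label >= 0 && e.label < numClasses);
        ++above[static_cast<std::size_t>(e.label)];
    }

    const std::size_t n = list.size();
    SplitResult best;
    for (std::size_t i = 0; i + 1 < n; ++i) {
        const std::size_t c = static_cast<std::size_t>(list[i].label);
        ++below[c];
        --above[c];
        // Equal neighbouring values cannot be separated by a threshold.
        if (list[i].value == list[i + 1].value) continue;

        const std::size_t nBelow = i + 1;
        const std::size_t nAbove = n - nBelow;
        const double g = (static_cast<double>(nBelow) * gini(below, nBelow) +
                          static_cast<double>(nAbove) * gini(above, nAbove)) /
                         static_cast<double>(n);
        if (g < best.gini) {
            best.found = true;
            best.gini = g;
            best.threshold = (list[i].value + list[i + 1].value) / 2.0;
        }
    }
    return best;
}
```

Calling `bestContinuousSplit` on each continuous attribute's list and keeping the lowest gini across attributes gives the split for a node. Because only the two class histograms are held in memory during the scan, the cost per node is one pass over each attribute list, which is why faster tree construction is a natural target for improvement.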
Keywords/Search Tags: Data Mining, Classification, Decision Trees, SPRINT Algorithm