Decision Tree Algorithm And Application

Posted on: 2009-07-03
Degree: Master
Type: Thesis
Country: China
Candidate: X L Cheng
GTID: 2178360272963237
Subject: Computer application technology
Abstract/Summary:
As enterprises have built out their information systems, data warehouse and decision support technologies have been applied on an unprecedented scale. How to apply data mining methods within enterprise decision support systems has become a focus of research. This thesis studies the key techniques of decision tree algorithms, a family of classification algorithms in data mining.

The thesis first reviews decision tree classification algorithms, summarizing the main characteristics, advantages and disadvantages, scope of application, current state of improvement, applications, and prospects of the typical algorithms. With the rapid development of data processing technology, the volume of data to be processed has grown far beyond earlier levels, from small databases to today's large-scale databases and data warehouses. The effectiveness, accuracy, and space requirements of data mining have therefore become major considerations. Building on the preceding study of typical decision tree classification algorithms, the thesis introduces sampling into the C4.5 decision tree algorithm, so that an algorithm that is effective on small data sets can also produce reasonably correct rules for large data sets. Standard data sets from the UCI machine learning repository serve as the data source, and the sampling-based C4.5 algorithm is used to mine classification rules. Tests show that the method can significantly improve the efficiency of data mining while maintaining satisfactory accuracy.

The improved algorithm is then applied in a concrete setting, iron and steel enterprises, in two areas: the production costs of key processes and the analysis of enterprise losses. The first application takes the process route as its starting point and, in combination with the enterprise's cost analysis project, builds a data warehouse model for process production costs. The improved C4.5 algorithm for massive data is used to mine the key processes in the process routes, the classification rules for the factors that affect costs, and the best matching of processes and teams. The second application, in combination with the enterprise's sales loss analysis project, builds a data warehouse model for loss analysis and mines the key factors that influence losses. Together, the two applications provide a scientific basis for enterprise cost management and useful experience for building data mining systems.
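
The abstract does not say how the sampling step is combined with C4.5, so the following is only a minimal sketch under stated assumptions, not the thesis's implementation. It assumes simple random sampling without replacement and categorical attributes, and it shows one piece of the idea: evaluating the C4.5 split criterion (gain ratio) on a sample instead of the full data set. The function names `entropy`, `gain_ratio`, and `best_attribute_on_sample` are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, labels, attr):
    """C4.5 gain ratio of the categorical attribute at index `attr`."""
    total = len(rows)
    # Group the class labels by the value the attribute takes in each row.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    weights = [len(part) / total for part in partitions.values()]
    remainder = sum(w * entropy(part)
                    for w, part in zip(weights, partitions.values()))
    gain = entropy(labels) - remainder
    split_info = -sum(w * math.log2(w) for w in weights)
    # An attribute with a single value carries no split information.
    return gain / split_info if split_info > 0 else 0.0


def best_attribute_on_sample(rows, labels, sample_size, seed=0):
    """Pick the split attribute using a random sample of the data.

    Assumption: simple random sampling without replacement; the thesis
    may use a different sampling scheme.
    """
    rng = random.Random(seed)
    chosen = rng.sample(range(len(rows)), min(sample_size, len(rows)))
    sample_rows = [rows[i] for i in chosen]
    sample_labels = [labels[i] for i in chosen]
    return max(range(len(rows[0])),
               key=lambda a: gain_ratio(sample_rows, sample_labels, a))


# Synthetic data: attribute 0 determines the class, attribute 1 does not.
rows = [(i % 2, (i // 2) % 3) for i in range(6000)]
labels = ["yes" if row[0] == 0 else "no" for row in rows]
print(best_attribute_on_sample(rows, labels, sample_size=200))
```

In this sketch only 200 of the 6000 rows are scanned to choose the split, which is where the efficiency gain would come from; whether the rules learned from a sample stay accurate on the full data is the question the thesis tests on UCI data sets.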
Keywords/Search Tags: Decision tree algorithm, Sampling, Iron and steel enterprises, Cost analysis, Data warehouse modeling