The Research On An Improved Algorithm For Incremental Induction Of Decision Tree

Posted on:2008-03-14

Degree:Master

Type:Thesis

Country:China

Candidate:Y S Liu

Full Text:PDF

GTID:2178360272968280

Subject:Computer application technology

Abstract/Summary:

In online classification systems, such as customer behavior analysis, Web log analysis, and intrusion detection, an important problem is making the classifier adapt to new samples so that it continues to classify correctly and keeps working. Several incremental decision tree induction algorithms already exist to address the data increment problem. However, these algorithms generally have high storage costs because they preserve a large amount of historical sample data. In addition, to keep the tree structure consistent with that of a traditionally built decision tree, they must test the structure of the tree and adjust it whenever a new sample arrives, and this adjustment has a computational cost. They therefore cannot meet the needs of online classification systems. Moreover, the data increment problem is a comparatively simple kind of increment problem; class increment and attribute increment problems also exist and are especially complicated in real-world settings. Traditional incremental decision tree algorithms have focused on the data increment problem and have ignored the class increment and attribute increment problems.

To solve these three kinds of increment problems, an improved hybrid classifier algorithm is proposed, based on research into decision tree induction algorithms and the Bayesian method. The new algorithm combines the merits of the decision tree induction method and the naive Bayesian method: it retains the good interpretability of a decision tree and has good incremental learning ability. When an increment problem occurs, the algorithm applies the model already learned to the newly added samples and carries out incremental learning on the basis of the historical count information and the new samples, so as to acquire the knowledge those samples contain. This keeps the classifier real-time and effective.

To evaluate the performance of the new hybrid classifier algorithm, a comparative experiment between the new algorithm and the existing decision tree induction algorithm is presented. The experimental data come from the UCI standard database. The results show that the new algorithm handles the increment problems in data mining easily and well. Compared with rebuilding the decision tree using the traditional method, the new algorithm takes less time and classifies samples more accurately, so it is more suitable for online classification systems.
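
The abstract does not give the details of the hybrid algorithm, so the following is only a minimal sketch of the count-based incremental learning idea it mentions, shown for the naive Bayes part alone. The class name `IncrementalNaiveBayes`, its methods, and the use of Laplace smoothing are assumptions made for illustration and are not taken from the thesis. The sketch keeps only counts (no historical samples), so a new sample, a new class, or a new attribute can each be absorbed by updating counters.

```python
import math
from collections import defaultdict


class IncrementalNaiveBayes:
    """Hypothetical count-based naive Bayes over categorical attributes.

    Only counts are stored, never the samples themselves, so learning
    from a new sample is a handful of counter increments.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (an assumption)
        self.n = 0  # total number of samples seen
        self.class_counts = defaultdict(int)  # class -> count
        # value_counts[attribute][class][value] -> count
        self.value_counts = defaultdict(
            lambda: defaultdict(lambda: defaultdict(int))
        )
        self.values_seen = defaultdict(set)  # attribute -> distinct values

    def update(self, sample, label):
        """Absorb one sample (a dict attribute -> value) with its class.

        Unseen classes and unseen attributes simply create new counters,
        which covers data, class, and attribute increments alike.
        """
        self.n += 1
        self.class_counts[label] += 1
        for attr, value in sample.items():
            self.value_counts[attr][label][value] += 1
            self.values_seen[attr].add(value)

    def predict(self, sample):
        """Return the class with the highest smoothed log-posterior."""
        best_label, best_score = None, -math.inf
        k = len(self.class_counts)
        for label, c in self.class_counts.items():
            score = math.log((c + self.alpha) / (self.n + self.alpha * k))
            for attr, value in sample.items():
                if attr not in self.values_seen:
                    continue  # attribute never observed: no evidence
                per_class = self.value_counts[attr].get(label, {})
                # Samples of this class that actually carried the attribute;
                # older samples may predate an attribute increment.
                total = sum(per_class.values())
                count = per_class.get(value, 0)
                v = len(self.values_seen[attr])
                score += math.log(
                    (count + self.alpha) / (total + self.alpha * v)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

In this sketch, calling `update` with a previously unseen class label or attribute name needs no retraining, which is the property the abstract attributes to learning from historical count information. How the thesis combines such counts with the decision tree structure is not specified in the abstract and is not modeled here.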

Keywords/Search Tags:

Data Mining, Decision Tree, Naive Bayes, Incremental Learning, Estimated Probability

Related items

1	Research On Hybrid Classification Based On Navie Bayes And Decision Tree
2	Incremental Learning Of Naive Bayes Chinese Classification System
3	Research On Personal Credit Evaluation Based On Decision Tree Integration Algorithm
4	Research On Incremental Learning Algorithm Of Decision Tree For Intelligence Large Data
5	Based On Decision Tree Incremental Learning Imaging Target Classification Technology Research
6	Comparing Classifiers In Data Mining
7	The Research On Several Problems Of Bayesian Theory
8	Research On Spam Behavior Patterns And Recognition Methods
9	Research And Application Of Data Mining Technology Used In The Analysis Of Mobile Customers' Loss
10	Data Mining Systems And Their Applications - Improve The Performance Of The Naive Bayes Text Classifier, Associated Characteristics