Research On Hybrid Classification Based On Naive Bayes And Decision Tree

Posted on:2017-03-22

Degree:Master

Type:Thesis

Country:China

Candidate:D M Li

Full Text:PDF

GTID:2308330482979884

Subject:Computer Science and Technology

Abstract/Summary:

Data has always been a valuable resource: useful information extracted from it can create economic benefit in many fields. Data mining is an approach to extracting useful information from data with a low signal-to-noise ratio, and classification is one of the most active topics in data mining research. Many classification approaches exist, such as naive Bayes, decision trees, and support vector machines. Naive Bayes and decision trees are widely used in many fields because they are simple, fast, and give robust results. For example, Google's spelling check is based on naive Bayes. This thesis proposes a novel approach with three parts:

(1) Although naive Bayes generally performs well, it has two weaknesses in some special cases: the conditional independence it assumes between any two attributes, and its rough probability estimation. This thesis proposes an optimization function for naive Bayes that improves the rough probability estimation; the resulting approach is called R-NBC. To avoid underflow and overfitting, the optimization function handles both normal attributes and attributes whose conditional probability is zero. Performance is tested on UCI datasets, and the results show that R-NBC improves classification accuracy over the traditional approach, especially for high-dimensional data.

(2) For classification problems with multiple class labels, this thesis proposes an approach that combines naive Bayes with a decision tree. Because of noise in the instances, a decision tree may overfit and lose accuracy. Therefore, before the data is analyzed with the decision tree, R-NBC is applied to the training dataset to remove the noisy instances. The results of the combined decision tree and R-NBC approach show that the proposed method is effective for the challenging multi-class classification problem. It can also help extract the important attributes and a dataset with a high signal-to-noise ratio from a noisy, high-dimensional dataset.

(3) Finally, the thesis includes a GUI for the algorithm that analyzes the data and shows the classification results directly on the screen.
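Parts (1) and (2) of the abstract can be illustrated with a minimal sketch. The abstract does not give the R-NBC optimization function, so this example substitutes two standard techniques as stand-ins: Laplace smoothing (so that an attribute value with a zero count does not produce a zero conditional probability) and log-space scoring (so that multiplying many small probabilities does not underflow). The class `SmoothedNaiveBayes`, the function `filter_noisy_instances`, the parameter `alpha`, and the toy data are all hypothetical names chosen for illustration. The filtering step, which drops training instances the classifier labels differently, is one plausible reading of "remove the noise instances"; the decision tree itself is not shown.

```python
import math
from collections import Counter, defaultdict


class SmoothedNaiveBayes:
    """Categorical naive Bayes with Laplace smoothing, scored in log space.

    A stand-in for illustration only; this is NOT the thesis's R-NBC function.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength (assumed value)

    def fit(self, rows, labels):
        n_features = len(rows[0])
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # value_counts[(label, feature_index)][value] -> count
        self.value_counts = defaultdict(Counter)
        # distinct values seen per feature, used in the smoothing denominator
        self.values = [set() for _ in range(n_features)]
        for row, label in zip(rows, labels):
            for j, value in enumerate(row):
                self.value_counts[(label, j)][value] += 1
                self.values[j].add(value)
        return self

    def log_scores(self, row):
        """Return the unnormalized log posterior for each class."""
        scores = {}
        for label, count in self.class_counts.items():
            score = math.log(count / self.total)  # log prior
            for j, value in enumerate(row):
                seen = self.value_counts[(label, j)][value]
                k = len(self.values[j])
                # Smoothing keeps the estimate above zero for unseen values;
                # summing logs avoids underflow from long products.
                score += math.log((seen + self.alpha) / (count + self.alpha * k))
            scores[label] = score
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


def filter_noisy_instances(rows, labels, alpha=1.0):
    """Keep only training instances whose label the classifier reproduces."""
    model = SmoothedNaiveBayes(alpha).fit(rows, labels)
    kept = [(r, y) for r, y in zip(rows, labels) if model.predict(r) == y]
    return [r for r, _ in kept], [y for _, y in kept]


# Toy data: the last instance contradicts the three "play" examples above it.
rows = [
    ("sunny", "hot"), ("sunny", "hot"), ("sunny", "mild"),
    ("rainy", "cool"), ("rainy", "cool"), ("rainy", "mild"),
    ("sunny", "hot"),
]
labels = ["play", "play", "play", "stay", "stay", "stay", "stay"]

model = SmoothedNaiveBayes().fit(rows, labels)
clean_rows, clean_labels = filter_noisy_instances(rows, labels)
print(len(rows), "->", len(clean_rows))  # 7 -> 6
```

The filtered `clean_rows` and `clean_labels` would then be passed to any decision tree learner. Note that the combination ("rainy", "hot") never occurs with the label "play" in the toy data, yet its score stays finite because of the smoothing term.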

Keywords/Search Tags:

Classification Algorithm, Naive Bayes, Probability Estimate, Decision Tree

Related items

1	Research On Text Classification Algorithm Based On Naive Bayes Method
2	The Research On An Improved Algorithm For Incremental Induction Of Decision Tree
3	Research On Naive Bayes Classifiers And Its Improved Algorithms
4	Bayesian Classification Model Based On ISOMAP Algorithm And Its Application
5	The Research Of User Classification Algorithm Based On The Regularized-naive Bayes
6	Comparing Classifiers In Data Mining
7	Research On Personal Credit Evaluation Based On Decision Tree Integration Algorithm
8	Research On Text Classification Algorithms Based On Machine Learning
9	Classification Based On Influence Functions
10	Research On Bayesian Networks-Based Text Classification Algorithms