
The Comparative Study on the Augmented Bayesian Classifier and Its Optimization

Posted on: 2016-06-26
Degree: Master
Type: Thesis
Country: China
Candidate: J X Tang
Full Text: PDF
GTID: 2308330461952859
Subject: Applied Economics, Statistics
Abstract/Summary:
The naive Bayes classifier is an important family of classifiers. Its form is simple, its computation is easy, its memory requirements are low, and its execution efficiency is high compared with other classifiers. Because its decision rule adopts the idea of posterior probability maximization, it is easy to understand and easy to promote in the field of data mining, and it has good application value.

The naive Bayes classifier relies on a strong assumption, namely conditional independence: the attribute variables are required to be mutually independent given the class. The positive effect of the assumption is that it simplifies the computation of the conditional probabilities. Under the assumption, the joint conditional probability reduces to a product of independent conditional probabilities, one per attribute variable. The assumption also has its defects, however: real data often fails to satisfy conditional independence, which can degrade the accuracy of the naive Bayes classifier.

This thesis centers on the conditional independence assumption and carries out a comparative study of different forms of the naive Bayes algorithm. Through this comparison, and through a reading of the relevant literature on improvements, it aims to arrive at a new kind of naive Bayes classifier that is more accurate than previous ones. The relevant literature shows that comparative work on improved Bayes classifiers concentrates on the case where conditional independence does not hold. In general, improved algorithms take one of two views of the assumption.
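As a concrete illustration of the decision rule described above, the following is a minimal sketch of a naive Bayes classifier for categorical features. The add-one (Laplace) smoothing and the toy weather data are assumptions of this sketch, not something specified in the thesis:

```python
from collections import defaultdict

class NaiveBayes:
    """Minimal categorical naive Bayes with add-one smoothing (illustrative sketch)."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.n = len(y)
        self.class_counts = defaultdict(int)
        # feature_counts[(c, i, v)] = how often feature i takes value v in class c
        self.feature_counts = defaultdict(int)
        self.feature_values = [set() for _ in range(len(X[0]))]
        for xi, c in zip(X, y):
            self.class_counts[c] += 1
            for i, v in enumerate(xi):
                self.feature_counts[(c, i, v)] += 1
                self.feature_values[i].add(v)
        return self

    def predict(self, x):
        best, best_score = None, -1.0
        for c in self.classes:
            # posterior is proportional to P(c) * prod_i P(x_i | c)
            score = self.class_counts[c] / self.n
            for i, v in enumerate(x):
                num = self.feature_counts[(c, i, v)] + 1  # add-one smoothing
                den = self.class_counts[c] + len(self.feature_values[i])
                score *= num / den
            if score > best_score:
                best, best_score = c, score
        return best

# Toy data: (outlook, windy) -> play
X = [("sunny", "no"), ("sunny", "yes"), ("rain", "no"), ("rain", "yes")]
y = ["yes", "no", "yes", "no"]
clf = NaiveBayes().fit(X, y)
print(clf.predict(("sunny", "no")))  # -> "yes"
```

The per-feature product in `predict` is exactly the simplification that the conditional independence assumption buys: no joint distribution over all features is ever estimated.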
The first approach tries to better satisfy the assumption: variables that violate it are removed from the model, so that the feature variables that remain approximately meet the requirement of conditional independence. To some extent this does improve accuracy. The second approach retains all the feature variables and relaxes the original strong condition, conceding that the assumption need not hold for real data. The original assumption admits only the one-to-one relationships between each feature variable and the class variable and discards the relationships among the feature variables themselves. Relaxing it leads to the Bayesian network model, which admits dependencies among the feature variables, or to a compromise in which the feature variables are partitioned into feature sets: different sets are conditionally independent of one another, while feature variables within the same set may depend on each other, so the independence assumption is applied only between sets. To some extent, this method works.

The research method of this thesis is the comparative study. The study focuses mainly on accuracy and execution efficiency, with accuracy as the most important index, because execution efficiency can now be improved by distributed parallel computing, while accuracy largely determines a classifier's practical value. The research content is the optimization and development of the naive Bayes classifier, examined in a comparative way.
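The two factorizations contrasted above can be written out explicitly; the partition notation $S_1, \dots, S_m$ is introduced here for illustration and is not the thesis's own:

```latex
% Naive Bayes: full conditional independence of the attributes x_1, ..., x_n given class C
P(C \mid x_1, \dots, x_n) \;\propto\; P(C) \prod_{i=1}^{n} P(x_i \mid C)

% Compromise: partition the attributes into feature sets S_1, ..., S_m that are
% conditionally independent of each other; dependence is allowed inside each set
P(C \mid x_1, \dots, x_n) \;\propto\; P(C) \prod_{j=1}^{m} P(S_j \mid C)
```

When every set $S_j$ contains a single attribute, the second factorization reduces to the first; when dependencies exist, each factor $P(S_j \mid C)$ is a joint distribution over the variables in that set.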
In the comparative study, the main focus is the evolution of the improvements to the naive Bayes classifier and the ideas behind those improvements. In the optimization of the algorithm, the main purpose is to put forward new improvement ideas in order to raise the accuracy of traditional Bayes classifiers. In the end, I put forward a new method called FDNB-k (FDNB), which is based on the feature delet...
Keywords/Search Tags: naive Bayes classifier, Bayesian network model, conditional independence, comparative study, algorithm improvement