
The Study And Design Of Self-filtering Bayesian Classification Model Based On Information Theory

Posted on: 2016-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ning
Full Text: PDF
GTID: 2298330467999899
Subject: Data mining
Abstract/Summary:
In 2012, with the arrival of the big-data era, people came to realize how important the mass of data created by the information explosion is to enterprises. In many fields, decision-making depends more and more on data analysis and will eventually break away from traditional experience and intuition. In medicine, for example, the rapid development of data mining and the large volume of stored patient data have led people to think more seriously about applying data mining to auxiliary diagnosis and treatment. Used correctly, auxiliary diagnosis and treatment can greatly reduce both the workload of doctors and the rate of misdiagnosis.

Many of the data mining techniques in use cannot express causality among attributes. Bayesian networks, however, can find dependencies by using mutual information and show causal relationships through a graphical network, which makes them an important means of dealing with uncertain information. A Bayesian classification model that has high classification accuracy and clearly shows causality is therefore of practical significance.

Naive Bayes (NB) is the simplest restricted Bayes classifier. Building on NB, researchers have put forward many Bayes classifiers with higher classification accuracy, such as TAN and KDB. However, because NB and TAN are zero-dependence and one-dependence models respectively, they perform well on small data sets but cannot meet the demands of large data sets. KDB (K=2) has an advantage on large data sets, but on small data sets it is not as good as NB or TAN.

Therefore, on the basis of studying the theory and mechanism of the classical Bayes classifiers, this paper proposes a Bayesian classifier with a clear structure and a higher accuracy rate. Based on KDB, the paper proposes a more efficient two-dependence Bayesian network classifier.
To make the model more precise, the influence of candidate parent nodes needs to be considered when sorting the attributes. Dependencies among attributes are then found in this order to build a Bayesian network structure called Dynamic 2-DB (D2-DB), or the global model for short. Its accuracy, however, is not high enough. We therefore use local mutual information and local conditional mutual information to construct a local 2-dependence Bayes classification model, called the local model, according to the rules of D2-DB. The accuracy of the local model is very unstable. Through a large number of experiments we then observe a law: "Error classification is fuzzy". Based on this law, we combine the local model and D2-DB to obtain a more accurate Bayesian model, the self-filtering Bayesian classifier (SBC). Experiments show that SBC has a wider range of application and a higher accuracy.
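
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal Python sketch of the classical KDB structure search: attributes are ranked by mutual information with the class, and each attribute takes as parents the at most k earlier attributes with the highest conditional mutual information given the class. It is not the thesis's D2-DB, local model, or SBC, whose exact rules the abstract does not give. The function names (`mutual_information`, `conditional_mutual_information`, `kdb_structure`) and the toy data are assumptions made for illustration, and probabilities are plain empirical frequencies without smoothing.

```python
from collections import Counter
from math import log


def mutual_information(xs, cs):
    """Empirical I(X; C) in nats for two equal-length discrete sequences."""
    n = len(xs)
    n_xc = Counter(zip(xs, cs))
    n_x = Counter(xs)
    n_c = Counter(cs)
    # p(x,c) * log( p(x,c) / (p(x) p(c)) ), written in terms of counts
    return sum((cnt / n) * log(cnt * n / (n_x[x] * n_c[c]))
               for (x, c), cnt in n_xc.items())


def conditional_mutual_information(xs, ys, cs):
    """Empirical I(X; Y | C) in nats for three equal-length discrete sequences."""
    n = len(xs)
    n_xyc = Counter(zip(xs, ys, cs))
    n_xc = Counter(zip(xs, cs))
    n_yc = Counter(zip(ys, cs))
    n_c = Counter(cs)
    # p(x,y,c) * log( p(x,y,c) p(c) / (p(x,c) p(y,c)) ), written in counts
    return sum((cnt / n) * log(cnt * n_c[c] / (n_xc[(x, c)] * n_yc[(y, c)]))
               for (x, y, c), cnt in n_xyc.items())


def kdb_structure(columns, labels, k=2):
    """Return {attribute: [attribute parents]} for a k-dependence classifier.

    columns maps attribute name -> list of discrete values.
    The class variable is an implicit parent of every attribute.
    """
    # Step 1: order attributes by I(X; C), most informative first.
    order = sorted(columns,
                   key=lambda a: mutual_information(columns[a], labels),
                   reverse=True)
    parents = {}
    for i, attr in enumerate(order):
        # Step 2: among attributes placed earlier, keep the k with the
        # highest I(attr; earlier | C) as parents.
        ranked = sorted(order[:i],
                        key=lambda b: conditional_mutual_information(
                            columns[attr], columns[b], labels),
                        reverse=True)
        parents[attr] = ranked[:k]
    return parents


if __name__ == "__main__":
    # Toy data (hypothetical): "a" copies the class, "b" is independent of it.
    labels = [0, 0, 1, 1, 0, 0, 1, 1]
    columns = {"a": list(labels), "b": [0, 1, 0, 1, 0, 1, 0, 1]}
    print(kdb_structure(columns, labels, k=2))
```

With k=2 this corresponds to the KDB (K=2) setting mentioned in the abstract; setting k=0 yields the NB structure, in which no attribute has an attribute parent.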
Keywords/Search Tags: Bayesian Networks, Self-filtering, Local Mutual Information, Bayes Classifier, Dynamic Structure