
Optimization Of Bayesian Network Structure Based On Dynamic Threshold

Posted on: 2021-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhang
Full Text: PDF
GTID: 2428330626958918
Subject: Computer technology
Abstract/Summary:
Machine learning is able to use the experience gained from data to summarize useful rules, and classification is a common task in this area. The Bayesian network classifier (BNC) is a typical classification method, and the extensive research on BNCs has advanced the study of uncertainty reasoning. The conditional independence assumption of Naive Bayes (NB) makes its network topology the simplest among restricted BNCs. Other classification models can be regarded as extensions of NB, and this research falls into two categories. The first category relaxes NB's conditional independence assumption and includes the tree-augmented naive Bayes, the k-dependence Bayesian network classifier (KDB), and extended models of KDB. The second category optimizes the structure of NB itself, for example by using feature selection based on forward sequential selection to remove redundant features.

Compared with other models, KDB obtains satisfactory classification accuracy when processing large datasets. We divide the dependencies in KDB into direct dependencies and conditional dependencies: a direct dependency holds between a feature and the class variable C, and a conditional dependency holds between two features given C. However, KDB does not consider whether the direct and conditional dependencies contain redundancy, or whether redundant dependencies affect classification accuracy.
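In the selection methods described below, candidate features are ranked by their mutual information (MI) with the class variable C, and candidate conditional dependencies by conditional mutual information (CMI). For reference, these are the standard information-theoretic quantities (the abstract itself only names the criteria):

MI(X_i; C) = \sum_{x_i, c} P(x_i, c) \log \frac{P(x_i, c)}{P(x_i)\,P(c)}

CMI(X_i; X_j \mid C) = \sum_{x_i, x_j, c} P(x_i, x_j, c) \log \frac{P(x_i, x_j \mid c)}{P(x_i \mid c)\,P(x_j \mid c)}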
Identifying and removing redundant direct and conditional dependencies in a classification model is a selection problem. Filter and wrapper are two selection approaches with different advantages. The filter approach is independent of the classification algorithm and therefore computationally efficient, while the wrapper approach uses the classification algorithm itself as the evaluation function to score candidate subsets and find the optimal one. Forward sequential selection (FSS) and backward sequential elimination (BSE) are two search orders used with the wrapper approach. Compared with the filter approach, the final model obtained by the wrapper approach usually achieves better classification accuracy.

Based on information theory, the Feature Selection (FS) and Dependence Selection (DS) methods proposed in this thesis combine the advantages of the filter and the wrapper approaches. FS and DS search greedily in two main steps. First, the filter step ranks the candidate features or conditional dependencies by the MI or CMI criterion, and the minimum MI and minimum CMI in each iteration are taken as the new dynamic thresholds. Then, guided by these dynamic thresholds, the wrapper step evaluates the candidate feature subset or dependence subset in each iteration, seeking a better 0–1 loss (a sketch of one reading of this procedure is given after the abstract). We apply both FS and DS to KDB and call the resulting BNC Adaptive KDB (AKDB). Once the termination condition is reached, the iteration stops and yields the optimal feature subset and the optimal conditional dependence subset. To show that FS and DS also work separately, we present two further versions of KDB: one with only feature selection and one with only conditional dependence selection.

We conduct experiments on 30 UCI datasets and analyze the number of dependencies, the change in mutual information, and the classification accuracy. The results show that the FS and DS methods alleviate the potential redundancy problem and help to improve classification accuracy. Finally, we provide an overall analysis of AKDB; the results show that AKDB achieves competitive classification performance compared with several state-of-the-art BNCs in terms of 0–1 loss, root mean squared error, bias, and variance.
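The following Python sketch illustrates one plausible reading of the dynamic-threshold search for the FS case: backward elimination in which the minimum MI among the remaining features sets the threshold, and the wrapper keeps a reduction only if it lowers the 0–1 loss. All names here (mutual_information, zero_one_loss, dynamic_threshold_selection) are illustrative assumptions, not code from the thesis, and the wrapper evaluation is left to the caller.

import numpy as np
from collections import Counter

def mutual_information(x, y):
    # Empirical mutual information (in nats) between two discrete variables.
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    return sum(
        (c / n) * np.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in joint.items()
    )

def dynamic_threshold_selection(X, y, zero_one_loss):
    # Sketch of dynamic-threshold greedy elimination for the FS case.
    # zero_one_loss(features) -> empirical 0-1 loss of the classifier
    # trained on the given feature subset (the wrapper step); it is
    # assumed to be supplied by the caller.
    selected = list(range(X.shape[1]))
    # Filter step: score every feature by its MI with the class.
    mi = {i: mutual_information(X[:, i], y) for i in selected}
    best_loss = zero_one_loss(selected)
    while len(selected) > 1:
        # The minimum MI among the remaining features becomes the
        # new dynamic threshold; features at the threshold are the
        # candidates for removal in this iteration.
        threshold = min(mi[i] for i in selected)
        trial = [i for i in selected if mi[i] > threshold]
        if not trial:
            break  # all remaining features tie at the threshold
        # Wrapper step: keep the reduced subset only if 0-1 loss improves.
        trial_loss = zero_one_loss(trial)
        if trial_loss < best_loss:
            selected, best_loss = trial, trial_loss
        else:
            break  # termination: removal no longer improves 0-1 loss
    return selected, best_loss

A caller would supply zero_one_loss as, for example, the cross-validated error of a KDB restricted to the given features; the DS variant is analogous, with CMI scores over candidate feature-feature dependencies in place of the MI scores.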
Keywords/Search Tags: Bayesian Network Classifiers, Direct Dependencies, Conditional Dependencies, Thresholding