
Research On Stochastic Optimization Based On Bayesian Network Classifiers

Posted on: 2024-07-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ren
GTID: 2568307064997119
Subject: Engineering

Abstract/Summary:
One of the key challenges in artificial intelligence is to develop models that provide structured representations of domain knowledge. For a long time, Bayesian networks (BNs) have been the mainstream medium for reasoning and knowledge representation under uncertainty. The Bayesian network classifier (BNC) is a special type of Bayesian network, mainly used for solving classification problems. Reducing the complexity of the network topology while making the learned joint probability distribution fit the data are two important but conflicting issues in learning a BNC.

Classical BNC models (e.g., TAN and KDB) can represent only a limited number of conditional dependencies, typically the most significant ones. Information-theoretic metrics, e.g., mutual information (MI) and conditional mutual information (CMI), are commonly applied to roughly quantify the mutual or conditional dependence between attributes. However, due to limitations in structural and computational complexity, biased estimates of high-order conditional probabilities may result in poor performance, especially on small datasets. To address this issue, researchers have proposed learning ensembles of classifiers. By transforming a single high-order topology into a set of low-order ones, ensemble learning algorithms can cover more of the hypotheses implicit in the training data and help achieve a tradeoff between bias and variance. Bagging and boosting are the two most popular ensemble learning approaches, and each trains its member classifiers on different subsets of the training data. However, resampling from the training data can vary the results of the ensemble's member classifiers, and the lost information may bias the estimate of the conditional probability distribution and thereby introduce insignificant rather than significant dependency relationships into the network topology of the BNC.

In this paper, we propose to learn from the training data as a whole and apply a heuristic search strategy to flexibly identify the significant conditional dependencies. In this process the attribute order is implicitly determined, yielding our base learning algorithm, the random Bayesian classifier (RBC). Using RBC as the base algorithm, multiple distinct classifiers are obtained at random; the collection of these classifiers is called a random Bayesian forest (RBF). Random sampling is introduced to make each member of the ensemble "unstable" and to fully represent the conditional dependencies. The resulting algorithm is highly scalable and combines the low variance of ensemble learning with the low bias of a high-dependence topology. On average, an ensemble of classifiers performs better than its committee members. Different attribute orders and augmented edges form different BNCs, and the classification performance of different BNCs varies, in some cases greatly. Randomness helps to learn RBCs independently, each describing the true Bayesian network from a different aspect, so a wrong prediction from one BNC may be corrected by another.

To verify the effectiveness of the proposed RBF algorithm, we carry out an experimental evaluation on 40 UCI datasets. The results reveal that RBF achieves remarkable classification performance compared to extended versions of state-of-the-art out-of-core BNCs (e.g., SKDB, WATAN, WAODE, SA2DE, SASA2DE and IWAODE) in terms of zero-one loss, RMSE, bias-variance decomposition and the Friedman test.
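As context for the CMI metric mentioned above, the following is a minimal Python sketch (illustrative, not code from the thesis; the function name and interface are hypothetical) of the plug-in estimate CMI(X; Y | C) = sum over (x, y, c) of P(x,y,c) * log( P(x,y|c) / (P(x|c) * P(y|c)) ), computed from frequency counts over discrete attribute and class values:

    import math
    from collections import Counter

    def conditional_mutual_information(xs, ys, cs):
        """Plug-in estimate of CMI(X; Y | C) from parallel lists of
        discrete attribute values xs, ys and class labels cs."""
        n = len(xs)
        n_xyc = Counter(zip(xs, ys, cs))   # joint counts of (x, y, c)
        n_xc = Counter(zip(xs, cs))        # counts of (x, c)
        n_yc = Counter(zip(ys, cs))        # counts of (y, c)
        n_c = Counter(cs)                  # counts of c
        cmi = 0.0
        for (x, y, c), nxyc in n_xyc.items():
            # P(x,y|c) / (P(x|c) * P(y|c)) reduces to
            # nxyc * n_c / (n_xc * n_yc) under frequency estimates
            ratio = (nxyc * n_c[c]) / (n_xc[(x, c)] * n_yc[(y, c)])
            cmi += (nxyc / n) * math.log(ratio)
        return cmi

    # Example (hypothetical data): score the conditional dependence of
    # attributes i and j given the class, to rank candidate edges.
    # score = conditional_mutual_information(column_i, column_j, labels)

A higher score indicates a stronger conditional dependence between the two attributes given the class, which is how such metrics are typically used to select which augmented edges to add to a BNC topology.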
Keywords/Search Tags:Bayesian network classifier, Ensemble learning, Stochastic optimization, Random sampling