Multistage Bayesian Network Learning Framework Using A Label-driven Approach

Posted on: 2020-12-01  Degree: Master  Type: Thesis
Country: China  Candidate: Y Sun  Full Text: PDF
GTID: 2428330575480525  Subject: Computer software and theory
Abstract/Summary:
The rapid development of machine learning has ushered in a new era of artificial intelligence, and classification is one of its fundamental problems. Among the various classification techniques, Bayesian network classifiers (BNCs) are well known for their competitive classification performance at low training cost, their model interpretability, and their ability to handle multi-class problems directly. Naive Bayes (NB) was the first BNC. It assumes that all attributes are independent given the class. In real-life applications with complex attribute dependencies, however, this strong independence assumption is often violated, which degrades NB's classification accuracy. The tree-augmented naive Bayes (TAN) classifier relaxes the assumption by allowing one-dependence relations between attributes. The k-dependence Bayesian classifier (KDB) makes it possible to build classifiers at arbitrary points along the attribute dependence spectrum.

Combining the outputs of multiple classifiers, known as ensemble learning, can often achieve better overall accuracy than any of the constituent classifiers. Ensemble methods can be divided into parallel and sequential ensembles according to whether their subclassifiers are trained independently. The weighted averaged TAN (WATAN) is a typical parallel ensemble. Multistage classification is a sequential ensemble method in which a prediction is made by a sequence of classifiers of increasing complexity: early-stage classifiers perform a coarse classification, which later stages refine to obtain more accurate results. Multistage classification can substantially improve classification accuracy at low computational cost, and it allows great flexibility in the choice of its components. It has therefore been widely applied in numerous fields.

BNCs tend to make wrong predictions when some labels have confidence degrees (posterior probabilities) close to that of the predicted label. To alleviate this problem and improve the classification accuracy of BNCs, we devise the label-driven multistage Bayesian network learning framework (MBLF). MBLF expands the classification of a testing instance into three stages: the preprocessing stage, the label filtering stage and the label specialization stage. In the preprocessing stage, a conventional BNC serving as a generalist classifier estimates the posterior probabilities of all labels. If more than one high-confidence label exists, the testing instance is reclassified in the following two stages. The label filtering stage first removes the low-confidence labels and then upgrades the generalist into a refined generalist by rebuilding the model so that it exploits only the credible information derived from the high-confidence labels. In the label specialization stage, an expert classifier is built for each high-confidence label to model label-specific attribute dependencies. The final classification result is obtained by averaging the predictions of the refined generalist and the experts.

We apply the proposed MBLF to TAN and name the resulting model Label-driven TAN (LTAN). We conduct extensive experiments on 40 benchmark datasets from the UCI machine learning repository. The empirical results show that LTAN has a significant accuracy advantage over not only some state-of-the-art single-structure BNCs but also several well-established ensemble BNCs, without incurring much computational overhead. These results indicate that the proposed MBLF is effective in improving classification accuracy.
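
The three-stage control flow described above can be illustrated with a minimal sketch. It is not the LTAN implementation from the thesis, and several parts are assumptions made only for illustration: a categorical naive Bayes model (`fit_nb`) stands in for TAN, a label counts as "high-confidence" when its posterior is at least `ratio` times the largest posterior, each expert is a one-vs-rest model trained on the filtered instances, and the final score of a label is the plain average of the refined generalist's posterior and that label's expert posterior. The abstract does not specify any of these details, and the names `fit_nb`, `mblf_predict` and `ratio` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def fit_nb(rows, labels):
    """Categorical naive Bayes with Laplace smoothing (assumed stand-in for TAN)."""
    classes = sorted(set(labels))
    class_counts = Counter(labels)
    domains = [set() for _ in rows[0]]
    value_counts = defaultdict(Counter)  # (attribute index, label) -> value counts
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, y)][v] += 1
            domains[i].add(v)

    def posterior(x):
        log_scores = {}
        for c in classes:
            s = math.log((class_counts[c] + 1) / (len(labels) + len(classes)))
            for i, v in enumerate(x):
                s += math.log((value_counts[(i, c)][v] + 1)
                              / (class_counts[c] + len(domains[i])))
            log_scores[c] = s
        top = max(log_scores.values())
        unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
        total = sum(unnorm.values())
        return {c: u / total for c, u in unnorm.items()}

    return posterior


def mblf_predict(rows, labels, x, ratio=0.5):
    """Classify x with a three-stage procedure in the spirit of MBLF."""
    # Stage 1 (preprocessing): the generalist scores every label.
    post = fit_nb(rows, labels)(x)
    best = max(post.values())
    # Assumed rule: a label is high-confidence if it is within `ratio` of the best.
    candidates = [c for c, p in post.items() if p >= ratio * best]
    if len(candidates) == 1:
        return candidates[0]

    # Stage 2 (label filtering): rebuild the model on high-confidence labels only.
    kept = [(r, y) for r, y in zip(rows, labels) if y in candidates]
    kept_rows = [r for r, _ in kept]
    kept_labels = [y for _, y in kept]
    refined = fit_nb(kept_rows, kept_labels)(x)

    # Stage 3 (label specialization): one expert per high-confidence label,
    # here an assumed one-vs-rest model; average it with the refined generalist.
    scores = {}
    for c in candidates:
        expert = fit_nb(kept_rows, [y == c for y in kept_labels])(x)
        scores[c] = 0.5 * (refined[c] + expert[True])
    return max(scores, key=scores.get)
```

Calling `mblf_predict(rows, labels, x)` with a list of attribute tuples, a parallel list of labels and one test tuple returns a single label. With a naive Bayes stand-in the experts cannot capture label-specific attribute dependencies, which is the point of using TAN in the thesis, so the sketch shows only the flow between stages.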
Keywords/Search Tags: Bayesian Network Classifiers, Ensemble Methods, Multistage Classification