As one of the basic tasks of machine learning, classification is widely used to solve many problems in real-world scenarios. The key to dealing with a classification task is to generate a classifier that not only provides a structured knowledge representation of the data but also delivers good classification performance. Bayesian network classifiers (BNCs) have long been a popular medium for graphically representing dependencies among predictive attributes, and they can naturally handle inference under conditions of uncertainty by estimating the posterior probability distributions of different class labels. The study of BNCs has attracted great attention from researchers since the extensive and successful application of naive Bayes (NB). Owing to its conditional independence assumption, NB enjoys a simplified topology and excellent computational complexity, although the assumption is rarely realistic and its probability estimates may therefore be suboptimal. Researchers have consequently proposed numerous approaches that improve the performance of NB by relaxing the attribute conditional independence assumption. Among these approaches, AODE, which averages the probability estimates of a collection of superparent one-dependence estimators (SPODEs), performs outstandingly. AODE achieves a trade-off between bias and variance while retaining NB's simplicity, and it has thus become one of the most popular BNCs. However, similar to NB, the conditional independence assumption of each SPODE rarely holds in reality. In addition, the averaging strategy adopted by AODE for ensemble learning ignores the differences among SPODE members, which limits the classification accuracy and generalization performance of AODE.

To further improve the performance of NB or AODE, researchers have proposed many improvement methods, including structure extension, attribute selection, attribute weighting, model selection, model weighting, and lazy learning. Compared with other methods, attribute weighting and model weighting can dynamically assign a weight to each attribute or
sub-model, and improve the classification accuracy and generalization performance of a BNC by increasing the proportion of high-confidence attributes or sub-models in the classification results. Thus, in this paper, we propose a double weighting learning strategy for AODE that applies both attribute weighting and model weighting, and thereby generate a Bayesian network classifier named Double Weighting AODE (DWAODE). In the attribute weighting stage, two measures, semi-pointwise mutual information and semi-pointwise conditional mutual information, are introduced as attribute weights to finely tune the conditional probability estimate of each predictive attribute and make the generated model fit the data well. In the model weighting stage, a pointwise log-likelihood function is applied to measure the reasonableness of each SPODE's network topology and then tune the estimate of the joint probability. Moreover, to further address the assumption of independent and identically distributed data, local learning methods are incorporated in this paper to learn locally optimal weights for specific data points. To evaluate the effectiveness of the double weighting scheme, the zero-one loss, bias-variance decomposition, root mean square error, and the Friedman and Nemenyi tests are introduced as evaluation indicators. The experimental results on 34 UCI datasets show that DWAODE is competitive with other state-of-the-art BNCs, including WATAN, AODE, WAODE-MI, IBWAODE, AVWAODE-IG and AWAODE, and achieves excellent classification performance as well as good data fitting.
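To make the averaging and weighting machinery concrete, the following is a minimal sketch of AODE-style classification in which each SPODE's joint estimate can be scaled by a model weight and each conditional probability raised to an attribute weight. The class name, the exponent form of attribute weighting, and the Laplace smoothing choices are illustrative assumptions only; this is not the semi-pointwise weighting or the exact DWAODE formulation proposed in the paper.

```python
import numpy as np

class WeightedAODE:
    """Sketch of AODE with optional per-attribute and per-model (SPODE) weights.
    Both weight vectors default to uniform, which recovers plain AODE averaging."""

    def fit(self, X, y, attr_w=None, model_w=None):
        X, y = np.asarray(X, int), np.asarray(y, int)
        self.n, self.d = X.shape
        self.classes = np.unique(y)
        self.cards = [int(X[:, j].max()) + 1 for j in range(self.d)]  # attribute cardinalities
        self.attr_w = np.ones(self.d) if attr_w is None else np.asarray(attr_w, float)
        self.model_w = np.ones(self.d) if model_w is None else np.asarray(model_w, float)
        # count tables: N(c, x_i) and N(c, x_i, x_j) for every superparent i, child j
        self.single, self.pair = {}, {}
        for c in self.classes:
            Xc = X[y == c]
            for i in range(self.d):
                for v in range(self.cards[i]):
                    rows = Xc[Xc[:, i] == v]
                    self.single[(c, i, v)] = len(rows)
                    for j in range(self.d):
                        for u in range(self.cards[j]):
                            self.pair[(c, i, v, j, u)] = int(np.sum(rows[:, j] == u))
        return self

    def _joint(self, c, x):
        # weighted sum over SPODEs of P(c, x_i) * prod_j P(x_j | c, x_i)^attr_w[j]
        total = 0.0
        for i in range(self.d):
            n_civ = self.single[(c, i, x[i])]
            # P(c, x_i) with Laplace smoothing
            p = (n_civ + 1.0) / (self.n + len(self.classes) * self.cards[i])
            for j in range(self.d):
                if j == i:
                    continue
                cond = (self.pair[(c, i, x[i], j, x[j])] + 1.0) / (n_civ + self.cards[j])
                p *= cond ** self.attr_w[j]   # attribute weighting as an exponent
            total += self.model_w[i] * p      # model (SPODE) weighting
        return total

    def predict(self, X):
        return np.array([max(self.classes, key=lambda c: self._joint(c, x))
                         for x in np.asarray(X, int)])
```

With uniform weights this reduces to the standard AODE decision rule; non-uniform `attr_w` and `model_w` show where attribute-level and model-level weights would respectively enter the conditional and joint probability estimates.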