
Universal Target Learning Strategy For Bayesian Network Classifiers

Posted on: 2021-03-03    Degree: Master    Type: Thesis
Country: China    Candidate: S Q Gao    Full Text: PDF
GTID: 2428330626958928    Subject: Software engineering
Abstract/Summary:
Machine learning is the science of using algorithms to parse data, learn from it, and then make decisions and predictions about real-world events. In the past few decades, statistical models such as Bayesian networks, neural networks, and support vector machines have been proposed and applied in many real-life settings, such as medical diagnosis, search engines, and biometrics. Bayesian network classifiers (BNCs) have proven their competitive classification performance across a range of practical applications, and highly scalable BNCs with strong expressive power have gradually become an important research direction in machine learning.

A Bayesian network classifier is a powerful tool for knowledge representation and reasoning under uncertainty. By learning a more reasonable network structure, a BNC can usually achieve higher classification accuracy. However, learning the structure of an unrestricted Bayesian network has been proven NP-hard, so researchers have instead studied restricted network structures. Among the many restricted BNCs, naive Bayes (NB) assumes that all attributes are independent given the class label. Even though this conditional independence assumption is unrealistic, or even wrong, for most data, the performance of the NB classifier is surprisingly good. To compensate for some of NB's limitations, Friedman et al. introduced the tree-augmented naive Bayes (TAN) classifier, which adds one-dependence relationships between attributes to relax NB's conditional independence assumption. The k-dependence Bayesian classifier (KDB) goes further and in theory can express feature dependencies of arbitrary order; a single parameter k strikes a good balance between classification performance and structural complexity (the factorizations behind NB and KDB are sketched after this abstract).

To improve prediction accuracy beyond a single model, ensemble learning trains multiple models for the same problem and combines them into a more robust one. As one-dependence ensemble BNCs, the AODE and WATAN methods generate multiple global models from a single learning algorithm through randomization, which helps achieve better overall accuracy on average.

Any model learned from training data, however, may not suit every test case. To mine the implicit dependencies in unlabeled test instances and mitigate the negative impact of the classification bias caused by overfitting, this thesis proposes a universal target learning (UTL) strategy for Bayesian network classifiers. On the one hand, the framework targets each unlabeled test instance and builds an "unstable" Bayesian network classifier from it; so that this instance-specific model and the model learned from the labeled training data are mutually complementary and can be combined effectively, the same learning strategy is used to construct both. On the other hand, based on an information-theoretic analysis of the log-likelihood LL(B|T), the framework fully mines the important dependencies between attribute values in a specific instance to help optimize the network structure. Conditional entropy is introduced as a loss function that measures, through the log-likelihood, the number of bits needed to encode the data under a BNC (the identity connecting the two is given below).

KDB is taken as an example to study the impact of UTL on both the training-based and instance-based models: applying UTL to KDB yields the UKDB model (a toy sketch of the combination step follows the abstract). Experimental results on 40 UCI datasets show that the classification accuracy of UKDB is not only better than that of single-structure BNCs (NB, TAN, KDB, etc.) but also holds significant advantages over ensemble models (AODE, WATAN, etc.), which strongly demonstrates the effectiveness of the UTL framework.
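To make the structural assumptions concrete, the factorizations behind NB and KDB can be written as follows. This is standard notation rather than notation quoted from the thesis: x_1, ..., x_n are the attribute values of an instance, y is the class label, and \Pi_i is the set of at most k attribute parents that KDB assigns to attribute X_i (TAN is the special case k = 1):

    % Naive Bayes: attributes mutually independent given the class
    P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y)

    % KDB: each attribute may additionally depend on up to k higher-ranked attributes
    P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y, \Pi_i), \quad |\Pi_i| \le \min(i-1, k)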
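The information-theoretic criterion mentioned in the abstract admits a compact form. For a Bayesian network B over variables Z_1, ..., Z_m with parent sets \Pi_i, maximum-likelihood parameters, and training data T of N instances, the log-likelihood decomposes into empirical conditional entropies; this is a standard identity (cf. Friedman et al.), restated here as a sketch of the loss the thesis refers to:

    LL(B \mid T) = -N \sum_{i=1}^{m} H_T(Z_i \mid \Pi_i)

Maximizing LL(B|T) is therefore equivalent to minimizing the summed conditional entropy, i.e., the expected number of bits needed to encode T under the structure B, which is why conditional entropy can serve directly as the loss function.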
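Finally, a minimal, hypothetical sketch of the UTL combination step in Python, assuming attributes and class labels are discretized and encoded as small non-negative integers. The simplified KDB learner below (plain mutual information for parent selection, Laplace smoothing) and the rarity-based re-ranking used to build the per-instance "targeted" model are illustrative assumptions, not the thesis's actual algorithm; only the overall pattern mirrors UTL: average the posteriors of a training-based model and an instance-specific model built by the same strategy.

import numpy as np

def mutual_info(a, b):
    # Empirical mutual information I(A;B) between two integer-coded columns.
    joint = np.zeros((a.max() + 1, b.max() + 1))
    for u, v in zip(a, b):
        joint[u, v] += 1
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])).sum())

class KDB:
    # Simplified k-dependence Bayesian classifier (illustrative, not optimized).
    def __init__(self, k=2, alpha=1.0):
        self.k, self.alpha = k, alpha

    def fit(self, X, y, order=None):
        d = X.shape[1]
        self.classes = np.unique(y)
        # Rank attributes by I(X_i; Y) unless a custom order is supplied.
        if order is None:
            order = sorted(range(d), key=lambda i: -mutual_info(X[:, i], y))
        self.order = order
        # Each attribute takes up to k parents among higher-ranked attributes.
        self.parents = {}
        for pos, i in enumerate(order):
            cand = sorted(order[:pos], key=lambda j: -mutual_info(X[:, i], X[:, j]))
            self.parents[i] = cand[:self.k]
        self.X, self.y = X, y
        return self

    def _cond_prob(self, i, xi, parent_vals, c):
        # Laplace-smoothed P(x_i | parents, class) from training counts.
        mask = self.y == c
        for j, v in parent_vals:
            mask = mask & (self.X[:, j] == v)
        card = self.X[:, i].max() + 1
        return ((self.X[mask, i] == xi).sum() + self.alpha) / (mask.sum() + self.alpha * card)

    def posterior(self, x):
        logp = {}
        for c in self.classes:
            lp = np.log((self.y == c).mean())
            for i in self.order:
                pv = [(j, x[j]) for j in self.parents[i]]
                lp += np.log(self._cond_prob(i, x[i], pv, c))
            logp[c] = lp
        w = np.exp(np.array(list(logp.values())) - max(logp.values()))
        return dict(zip(logp.keys(), w / w.sum()))

def utl_predict(train_model, X, y, x_test, k=2):
    # UTL-style prediction: average the posteriors of the training-based KDB and
    # a "targeted" KDB re-ranked for this instance. The rarity heuristic below
    # (rarer observed values ranked first) is an assumption for illustration.
    d = X.shape[1]
    freq = [(X[:, i] == x_test[i]).mean() for i in range(d)]
    order = sorted(range(d), key=lambda i: freq[i])  # rarest values first
    targeted = KDB(k=k).fit(X, y, order=order)
    p1, p2 = train_model.posterior(x_test), targeted.posterior(x_test)
    avg = {c: 0.5 * (p1[c] + p2[c]) for c in p1}
    return max(avg, key=avg.get)

# Usage with hypothetical integer-coded arrays X_train, y_train and instance x:
#   model = KDB(k=2).fit(X_train, y_train)
#   y_hat = utl_predict(model, X_train, y_train, x)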
Keywords/Search Tags:Information Theory, Universal Target Learning, Bayesian Network Classifier