
Research On Algorithms For Learning Bayesian Networks And Markov Blanket

Posted on: 2015-08-31
Degree: Master
Type: Thesis
Country: China
Candidate: X F Zhu
Full Text: PDF
GTID: 2308330464466765
Subject: Applied Mathematics
Abstract/Summary:
A Bayesian network is a general graphical model that integrates probability theory with graph theory. Because it rests on a solid theoretical foundation and has a unique advantage in encoding and reasoning about uncertain knowledge, it has been applied in many domains, such as machine learning, data mining, artificial intelligence, and forecasting. However, as data become increasingly high-dimensional, it is difficult to construct a Bayesian network from domain expertise alone, and most existing algorithms handle the problem poorly or not at all. Structure learning of Bayesian networks has therefore become a hot topic in many fields. Meanwhile, many researchers study Bayesian classifiers, which are powerful tools for classification in data mining, and feature selection plays an important role in constructing such classifiers. As the dimension of the data increases, the feature space becomes very large, which is a challenge for classical methods. This thesis focuses on both the structure learning of Bayesian networks and the analysis of Markov blankets, and proposes new algorithms for each of these problems. The major contributions are outlined as follows.

Firstly, we propose a novel algorithm, called EFHITON, for learning the Markov blanket of a target variable T; it is based on the HITON algorithm. While finding the set of parents and children of T, the separating set of T and each of its non-parents is stored as soon as it is found, so many repeated conditional independence (CI) tests can be avoided when searching for the spouses of T. In addition, the FEIPC-MB algorithm is proposed. Building on the original IPC-MB, it uses mutual information to adjust the order of the nodes that are not independent of the target variable, and it changes the way conditioning sets for CI tests are chosen instead of selecting them at random.

Secondly, most constraint-based algorithms for learning Markov blankets and Bayesian networks rely on a series of CI tests. However, higher-order conditional independence relations estimated from sample distributions are generally less reliable than lower-order ones, and the number of CI tests grows exponentially with the number of variables; these difficulties also affect several other algorithms that involve CI tests. In this thesis, we combine constraint-based techniques with mutual information to improve the reliability and efficiency of the PC algorithm; the improved algorithm is called FEPC. Its main steps are as follows. First, the target variables are processed in ascending order of the size of their current neighbor sets, and CI tests with the same conditioning-set size are carried out for every variable whose number of candidate neighbors exceeds that size. Furthermore, the nodes adjacent to the target variable are reordered for the CI tests. In addition, the conditioning sets of all tests are adjusted rather than selected at random from the neighbor variables. A schematic sketch of these ordering and caching ideas is given below.

Finally, simulations and real examples show that the improved algorithms outperform several existing algorithms in computational complexity, running time, accuracy, and correctness.
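The abstract describes the shared mechanism behind EFHITON, FEIPC-MB, and FEPC only in prose. The following is a minimal Python sketch of that mechanism, assuming hypothetical ci_test(data, X, Y, S) and mutual_information(data, X, Y) helpers and a significance level alpha; it illustrates a PC/HITON-style parents-and-children search with mutual-information ordering and separating-set caching, and is not the thesis implementation.

```python
# Minimal sketch (not the thesis code): a PC/HITON-style adjacency search for a
# target T, combining two ideas from the abstract -- ordering candidate
# neighbours of T by mutual information before CI testing, and caching the
# separating sets so later steps (e.g. the spouse search) can reuse them
# instead of repeating CI tests.  ci_test and mutual_information are assumed
# helpers passed in by the caller.

from itertools import combinations

def find_parents_children(T, variables, data, ci_test, mutual_information,
                          alpha=0.05):
    """Return (candidate neighbours of T, cached separating sets)."""
    # Rank candidates by mutual information with T (ascending): weakly
    # dependent variables are tested first and tend to be removed early,
    # so later tests draw conditioning sets from a smaller candidate set.
    candidates = sorted((v for v in variables if v != T),
                        key=lambda v: mutual_information(data, T, v))
    sepsets = {}          # X -> separating set that rendered X independent of T
    k = 0                 # conditioning-set size, grown gradually (low order first)
    while k < len(candidates):
        for X in list(candidates):
            others = [v for v in candidates if v != X]
            if len(others) < k:
                continue
            # Draw conditioning sets of size k from the remaining candidates
            # rather than picking them at random.
            for S in combinations(others, k):
                p_value = ci_test(data, T, X, list(S))
                if p_value > alpha:          # cannot reject T independent of X given S
                    candidates.remove(X)
                    sepsets[X] = set(S)      # cache the separating set for reuse
                    break
        k += 1
    return candidates, sepsets
```

Caching the separating sets is what avoids repeated work later: a subsequent spouse search can look up sepsets[X] directly instead of re-running the CI tests that removed X.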
Keywords/Search Tags: conditional independence test, Markov blanket, separating set, Bayesian network, ordering