
Theoretical Research Of Optimization Function Based On Node Sorting In Bayesian Network Structure Learning

Posted on: 2020-11-24
Degree: Master
Type: Thesis
Country: China
Candidate: K Q Zhang
Full Text: PDF
GTID: 2428330596472500
Subject: Applied Mathematics
Abstract/Summary:
Bayesian networks are an effective tool for handling uncertainty. Because they combine rigorous probabilistic reasoning with an intuitive graphical representation, they are widely used in artificial intelligence and machine learning. In the era of big data, the traditional approach of constructing Bayesian networks from expert knowledge can no longer meet the demand for fast and accurate learning, so learning Bayesian network structure effectively from data has attracted the interest of many researchers. The K2 algorithm, which uses a hill-climbing search strategy, is a classic and widely used Bayesian network structure learning algorithm. However, K2 depends strongly on the variable sequence and on the maximum number of parent nodes. Differences in the maximum number of parent nodes have no significant effect on learning efficiency, whereas different variable sequences can affect it greatly, so finding a better variable sequence has important research value.

This thesis takes the evaluation function for variable sequences in the K2 algorithm as its research objective. First, standard network structures are traversed with the breadth-first Kahn algorithm and the depth-first Tarjan algorithm to obtain better variable sequences. The K2-CH score and mutual information score properties of these sequences are then analyzed to find what the better sequences have in common. On this basis, a new evaluation function for variable sequences is proposed and used as the fitness function of a genetic search over the space of variable sequences, in order to find a better sequence. The resulting algorithm is called Chain-KMGA.

The experimental results show that Bayesian networks learned by Chain-KMGA have better structure: they achieve higher network scores and contain more correct edges and fewer erroneous edges. Chain-KMGA also has a shorter running time, and therefore higher learning efficiency. The scores of the learned Bayesian networks are strongly positively correlated with the variable sequence scores; that is, as the variable sequence score increases, the Bayesian network score increases correspondingly. Chain-KMGA can therefore find better variable sequences, and Bayesian networks are constructed with higher learning efficiency.
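The idea of traversing a known network to obtain a variable sequence can be illustrated with a minimal sketch of Kahn's algorithm. This is not the thesis's implementation: the function name `kahn_order`, the dictionary representation of the network, and the toy four-node network are all assumptions made for illustration. The sequence it returns places every parent before its children. K2 only considers nodes earlier in the sequence as candidate parents, so a sequence of this kind leaves every true parent available as a candidate.

```python
from collections import deque


def kahn_order(parents):
    """Return one topological ordering of a DAG.

    `parents` maps each node to the list of its parent nodes
    (hypothetical representation; every parent must also be a key).
    """
    # Count the not-yet-placed parents of each node and invert the map.
    indegree = {node: len(ps) for node, ps in parents.items()}
    children = {node: [] for node in parents}
    for node, ps in parents.items():
        for p in ps:
            children[p].append(node)

    # Start from the root nodes and process them in first-in, first-out order.
    queue = deque(node for node in parents if indegree[node] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:  # all parents placed: child is ready
                queue.append(child)

    if len(order) != len(parents):
        raise ValueError("graph contains a cycle, so no ordering exists")
    return order


# Toy network used only for illustration (not taken from the thesis).
toy_parents = {
    "Cloudy": [],
    "Sprinkler": ["Cloudy"],
    "Rain": ["Cloudy"],
    "WetGrass": ["Sprinkler", "Rain"],
}
order = kahn_order(toy_parents)
print(order)
```

A network usually has many valid topological orderings. Which one Kahn's algorithm returns depends on the order in which ready nodes are queued, which is one reason a scoring function for comparing candidate sequences is useful.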
Keywords/Search Tags: Bayesian network structure learning, Kahn algorithm, Tarjan algorithm, K2 algorithm, Mutual information