Font Size: a A A

The Research Of Bayesian Network Structure Learning Algorithms Based On K2 Score Metric

Posted on:2010-12-08Degree:MasterType:Thesis
Country:ChinaCandidate:H X ZhangFull Text:PDF
GTID:2178360275451303Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
Bayesian network (BN) is a graphical representation for probability distributions. Because of its well-defined semantics and solid theoretical foundations, it became an important theory model in the community of artificial intelligence, and also a powerful formalism to encode the uncertainty knowledge; BN has been applied in the fields such as machine learning, medical diagnoses, financial market analysis, and achieve a great success. Usually, it is difficult to construct a Bayesian network only by the domain expert。Therefore, learning a BN structure from data is very meaningful to its research and application.By using K2 network score metric, this dissertation mainly focuses on Bayesian network structure learning problem in the following three directions:1. To find a simpler structure and an easy-implementation algorithm, we introduced the Tabu search into Bayesian network structure learning problems, proposed a Tabu-search-based Bayesian network structure learning algorithm (TBN). First, the new algorithm generates the neighborhood solutions by add, subtract and reverse arc operators. And then, the Tabu list and aspiration criteria guide the search procedure corporately. After the iteration of the two steps above, the algorithm will finally obtain optimal and near optimal solutions. Compared with other algorithms, TBN has a simpler structure and faster speed.2. To solve the drawbacks of the ant colony optimization for learning Bayesian networks (ACO-B), we proposed an improved algorithm based on the conditional independence test and ant colony optimization (I-ACO-B). First, the I-ACO-B uses order-0 independence tests to effectively restrict the space of candidate solutions, so that many unnecessary searches of ants can be avoided. And then, by combining the global score increase of a solution and local mutual information between nodes, a new heuristic function with better heuristic ability is given to induct the process of stochastic searches. The experiment results on the benchmark data sets show that the new algorithm is effective and efficient in large scale databases, and greatly enhances convergence speed compared to the original algorithm.3. To learn Bayesian network structure from incomplete data with higher precision, we proposed an approach combined with both processes of data completing and ACO. First, unobserved data are randomly initialized, thus a complete dataset is got. Based on such a data set, an initialization BN is learned by Ant Colony Algorithm. Second, in light of the current best structure of evolutionary process, Expectation Maximization (EM) estimating and randomly sampling are performed to complete data. Third, on the basis of the new complete data set, the BN structure is evolved by an improved ACO process. Finally, the second and third steps are iterated until the global best structure is obtained. Experiment results show that the approach can effectively learn BN structure from incomplete data, and is more accurate than some recent algorithms.
Keywords/Search Tags:Bayesian network, structure learning, ant colony optimization, Tabu search, missing data
PDF Full Text Request
Related items