The Algorithm Application Of Bayesian Network In Data Mining

Posted on: 2013-12-16, Degree: Master, Type: Thesis
Country: China, Candidate: N Yan
GTID: 2248330371499445, Subject: Computer application technology
Abstract/Summary:
The Bayesian network, which originated in the field of artificial intelligence, is a useful tool that applies probability and statistics to uncertainty reasoning and data analysis in complex domains. In recent years, Bayesian networks have been widely applied in many areas, especially data mining, with considerable success. There are two fundamental reasons why Bayesian networks have attracted so much attention. First, they combine graph theory with probability and statistics, which makes them visual and clear. Second, they can learn from both prior information and sample data, and they are suitable for processing data with missing values. This paper focuses on the learning algorithms for Bayesian networks, which comprise parameter learning and structure learning. Intuitively, parameter learning indicates the quantitative relationships between variables, while structure learning reflects both the qualitative and the quantitative relationships between variables. The contents of this paper are as follows. (1) We analyze the research status of Bayesian networks, review the research history of uncertain knowledge, and explain why Bayesian networks are so widely studied. (2) Before introducing Bayesian networks, we briefly introduce the probability theory, information theory, and graph theory on which they rely. (3) Parameter learning covers two situations: one with a complete data set and the other with missing data. For parameter learning with a complete data set, the research emphasizes maximum likelihood estimation and Bayesian estimation, and the strengths and weaknesses of each are pointed out. For parameter learning with missing data, we mainly study the case of data missing at random, for which we apply the expectation maximization (EM) algorithm.
In the experiments of this paper, maximum likelihood estimation is compared with Bayesian estimation, the results are presented graphically, and we explain why the results turn out as they do. (4) Structure learning includes two kinds of methods: constraint-based and score-based. We focus on the score-based methods. We introduce the optimal-parameter likelihood function and the Bayesian score, and give the formulas for computing these scores. Score-based structure learning algorithms include the K2 algorithm, hill climbing, the SEM algorithm, and so on. In this paper, we introduce the family BIC score when we study the K2 algorithm in depth. We then conduct experiments with the K2 algorithm based on the family BIC score and report the results.
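
As an illustration of the quantities named in this abstract, the following is a minimal Python sketch of maximum likelihood estimation, Bayesian estimation, and a family BIC score for one node given its parents. It is not taken from the thesis. The function names, the toy data set, the uniform Dirichlet pseudo-count `alpha`, and the textbook form of the BIC penalty are all assumptions made for the example.

```python
import math
from collections import Counter

def family_counts(data, child, parents):
    """Count occurrences of each child value under each parent configuration."""
    counts = {}
    for row in data:
        cfg = tuple(row[p] for p in parents)
        counts.setdefault(cfg, Counter())[row[child]] += 1
    return counts

def mle(cfg_counts, values):
    """Maximum likelihood estimate: relative frequencies (cfg must be observed)."""
    n = sum(cfg_counts.values())
    return {v: cfg_counts[v] / n for v in values}

def bayes(cfg_counts, values, alpha=1.0):
    """Bayesian estimate under an assumed uniform Dirichlet prior (pseudo-count alpha)."""
    n = sum(cfg_counts.values())
    return {v: (cfg_counts[v] + alpha) / (n + alpha * len(values)) for v in values}

def family_bic(data, child, parents, arity):
    """Family BIC in its common textbook form (an assumption, not the thesis formula):
    maximized log-likelihood minus (free parameters / 2) * log(sample size)."""
    counts = family_counts(data, child, parents)
    loglik = 0.0
    for cfg_counts in counts.values():
        n_cfg = sum(cfg_counts.values())
        for n_val in cfg_counts.values():
            loglik += n_val * math.log(n_val / n_cfg)
    q = math.prod(arity[p] for p in parents)  # number of parent configurations
    free_params = q * (arity[child] - 1)
    return loglik - 0.5 * free_params * math.log(len(data))

# Hypothetical toy data: binary variables A and B, where B tends to follow A.
pairs = [(0, 0), (0, 0), (0, 0), (1, 1), (1, 1), (1, 0), (1, 1), (0, 0)]
data = [{"A": a, "B": b} for a, b in pairs]
arity = {"A": 2, "B": 2}

counts = family_counts(data, "B", ["A"])
mle_b_given_a0 = mle(counts[(0,)], [0, 1])      # B=1 is never seen when A=0
bayes_b_given_a0 = bayes(counts[(0,)], [0, 1])  # prior keeps B=1 above zero
score_with_parent = family_bic(data, "B", ["A"], arity)
score_without_parent = family_bic(data, "B", [], arity)
```

On this toy data the maximum likelihood estimate assigns zero probability to a value that was never observed, while the Bayesian estimate does not, which is one commonly cited difference between the two methods. A score-based search such as K2 would compare values like `score_with_parent` and `score_without_parent` when deciding whether to add a parent to a node.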
Keywords/Search Tags: Bayesian network, Data mining, EM algorithm, Bayesian scoring, K2 algorithm