Algorithm Research And Application In Structure Learning Of Bayesian Networks

Posted on: 2011-02-05 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: Y Sun | Full Text: PDF
GTID: 1118360332956999 | Subject: Computer application technology
Abstract/Summary:
A Bayesian network (BN) is a graphical representation of a probability distribution. Because of its well-defined semantics and solid theoretical foundations, it has become an important theoretical model in the artificial intelligence community and a powerful formalism for encoding uncertain knowledge. BNs have been applied with great success in fields such as machine learning, medical diagnosis, and financial market analysis. It is usually difficult to construct a Bayesian network from domain expertise alone, so fast and efficient learning from data matters to both research and application. Building on existing algorithms from the Chinese and international literature, this dissertation studies Bayesian network structure learning algorithms in depth and applies them to practical problems, such as predicting the risk factors of mild cognitive impairment and cerebrovascular diseases. The main work is as follows:
1. The K-Nearest Neighbor (KNN) algorithm is one of the classification algorithms most widely used in the fields of machine learning and data mining. This work combines a Bayesian network structure learning algorithm with KNN; the resulting algorithm is named BS-KNN. It improves the similarity evaluation of KNN: the higher the probability coefficient of a feature, the more important the feature and the greater its effect on classification. The experimental results indicate that the new algorithm has the same complexity as related algorithms, but that its accuracy and stability improve when the data set has many features and a larger sample size.
2. Incomplete data occur frequently and reduce the accuracy of learning algorithms. A new Bayesian network structure learning algorithm based on the geometric distribution is presented; it combines the geometric distribution with the Kullback-Leibler (KL) divergence and learns a Bayesian network directly from incomplete data.
First, the geometric distribution is used to represent the relationships between nodes. Second, the KL divergence is used to express the similarity between these relationships. Finally, estimates of the incomplete data are obtained. The algorithm avoids the exponential complexity of standard Gibbs sampling. Comparison with other related algorithms indicates that the new algorithm has higher accuracy in most situations.
3. Mild Cognitive Impairment (MCI) is thought to be the prodromal phase of Alzheimer's disease (AD), which is the most common form of dementia and leads to irreversible neurodegenerative damage of the brain. Research on methods for the prevention and treatment of AD is therefore very important. MCI is not easy to diagnose and requires a specialist physician to make a comprehensive diagnosis based on clinical experience. The MNBN algorithm is presented; it constructs a Bayesian network from memory, attention, and demographic data, which greatly reduces the cost of examination and increases the objectivity of diagnosis. The clinical experimental results show that the MNBN algorithm achieves better effectiveness.
4. Cerebrovascular diseases (CeVD) are a major cause of morbidity and mortality worldwide. The literature reports that CeVD is one of the three leading causes of death from disease. It is therefore of great significance to strengthen the investigation of CeVD risk factors. First, natural demographic information and several physiological indices are taken as CeVD risk factors, and the mutual relationships among them are analyzed. Second, information gain is used to determine the prior ordering of the nodes, the Bayesian network is constructed, and the probabilistic dependencies among the risk factors are studied further. Finally, experiments are carried out on a benchmark dataset.
Compared with related algorithms, the experimental results show that the model can assist in identifying the risk factors of CeVD objectively and effectively.
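The abstract does not say how information gain is turned into a node ordering, so the following is only a minimal sketch of one plausible reading: each candidate risk factor is scored by its information gain with respect to the outcome variable, and the factors are ranked from highest to lowest gain. The function names (`entropy`, `information_gain`, `order_nodes`), the feature names, and the toy records are all hypothetical and are not taken from the dissertation.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """IG(labels; feature) = H(labels) - H(labels | feature)."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def order_nodes(columns, labels):
    """Rank feature names by information gain, highest first (assumed rule)."""
    gains = {name: information_gain(col, labels) for name, col in columns.items()}
    return sorted(gains, key=gains.get, reverse=True), gains


# Made-up binary records, for illustration only (not real clinical data).
toy = {
    "hypertension": [1, 1, 1, 0, 0, 0, 1, 0],
    "smoker":       [1, 0, 1, 0, 1, 0, 0, 1],
    "age_over_60":  [1, 1, 0, 0, 1, 0, 1, 0],
}
cevd = [1, 1, 1, 0, 0, 0, 1, 0]  # hypothetical outcome column

ordering, gains = order_nodes(toy, cevd)
print(ordering)
```

On this toy data the ranking is `hypertension`, `age_over_60`, `smoker`. Such an ordering could then be passed to any structure learning procedure that requires a prior node order; which procedure the dissertation uses is not stated in the abstract.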
Keywords/Search Tags:Data mining, Bayesian Network, Incomplete data, Mild Cognitive Impairment