Bayes nets: A generalized variable elimination algorithm and applications to classification

Posted on: 2007-08-29
Degree: Ph.D.
Type: Dissertation
University: University of California, Santa Barbara
Candidate: Lei, Xiaofang
GTID: 1448390005979745
Subject: Statistics
Abstract/Summary:
The first part (Chapters 2 and 3) of the dissertation introduces an exact inference algorithm for discrete Bayesian networks, which extends the generalized variable elimination (GVE) algorithm. We globalize the query-oriented GVE algorithm by specifying a minimal query set and allowing observed variables to serve as "bucket" variables. In any network, the set of all leaf nodes is the minimal query set. The proposed algorithm builds a tree of bucket data structures (a bucket tree) and then updates the bucket tree to produce the marginal probability distribution of every variable in the network. The algorithm relies solely on independence relations and probability manipulation and requires no complex graph theory, which makes it easy to understand and implement.

The second part (Chapter 4) investigates the performance of Bayesian networks in classification and compares it with that of Support Vector Machine (SVM) methods and the classification tree method QUEST. When the relationship between the variables is known, the Bayesian network method usually works better than SVM methods, especially in high-dimensional problems. A further advantage of the Bayesian network method is its ability to handle missing values in both the training set and the test set. When the training set is not very large, discarding records with missing values degrades classification performance. Bayesian network methods make use of records with missing values and can achieve classification performance comparable to that obtained from a complete training set of the same size. Bayesian networks can also classify records with missing values, whereas SVM cannot. The Bayesian network method performs better than QUEST when missing values are present in the test set.

The computational parts of this dissertation use Kevin Murphy's Bayes Net Toolbox for Matlab, the SVM and Kernel Methods Matlab Toolbox from INSA de Rouen, and the QUEST Classification Tree (version 1.9.2) by Wei-Yin Loh and Yu-Shan Shih.
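As a rough illustration of the ideas summarized above, the following Python sketch runs plain bucket-style variable elimination for a single query on a tiny discrete network and uses it to classify a record with a missing value. It is not the dissertation's globalized GVE or bucket-tree algorithm, and it does not use the Bayes Net Toolbox. The network (a class node `C` with children `X1` and `X2`), the probability values, and the helper names `multiply`, `sum_out`, `restrict`, and `posterior` are all assumptions made for this example.

```python
from itertools import product

# A factor is a pair (vars, table): vars is a tuple of variable names and
# table maps a tuple of values (same order as vars) to a non-negative number.

def multiply(f, g, domains):
    """Pointwise product of two factors over the union of their variables."""
    fv, ft = f
    gv, gt = g
    out_vars = fv + tuple(v for v in gv if v not in fv)
    table = {}
    for values in product(*(domains[v] for v in out_vars)):
        row = dict(zip(out_vars, values))
        table[values] = (ft[tuple(row[v] for v in fv)]
                         * gt[tuple(row[v] for v in gv)])
    return out_vars, table

def sum_out(f, var):
    """Marginalize one variable out of a factor."""
    fv, ft = f
    i = fv.index(var)
    table = {}
    for values, p in ft.items():
        key = values[:i] + values[i + 1:]
        table[key] = table.get(key, 0.0) + p
    return fv[:i] + fv[i + 1:], table

def restrict(f, var, value):
    """Fix an observed variable to its value and drop it from the factor."""
    fv, ft = f
    if var not in fv:
        return f
    i = fv.index(var)
    table = {v[:i] + v[i + 1:]: p for v, p in ft.items() if v[i] == value}
    return fv[:i] + fv[i + 1:], table

def posterior(factors, domains, query, evidence):
    """P(query | evidence); unobserved non-query variables are summed out."""
    for var, value in evidence.items():
        factors = [restrict(f, var, value) for f in factors]
    for var in domains:
        if var == query or var in evidence:
            continue
        bucket = [f for f in factors if var in f[0]]
        if not bucket:
            continue
        rest = [f for f in factors if var not in f[0]]
        prod = bucket[0]
        for f in bucket[1:]:
            prod = multiply(prod, f, domains)
        factors = rest + [sum_out(prod, var)]
    result = factors[0]
    for f in factors[1:]:
        result = multiply(result, f, domains)
    total = sum(result[1].values())
    i = result[0].index(query)
    return {values[i]: p / total for values, p in result[1].items()}

# Hypothetical network C -> X1, C -> X2 with made-up probabilities.
domains = {"C": ["c0", "c1"], "X1": [0, 1], "X2": [0, 1]}
factors = [
    (("C",), {("c0",): 0.6, ("c1",): 0.4}),
    (("X1", "C"), {(0, "c0"): 0.8, (1, "c0"): 0.2,
                   (0, "c1"): 0.3, (1, "c1"): 0.7}),
    (("X2", "C"), {(0, "c0"): 0.5, (1, "c0"): 0.5,
                   (0, "c1"): 0.1, (1, "c1"): 0.9}),
]

# Record with X2 missing: X2 is simply summed out rather than discarded.
post_missing = posterior(factors, domains, "C", {"X1": 1})
# Fully observed record for comparison.
post_full = posterior(factors, domains, "C", {"X1": 1, "X2": 1})
print(post_missing, post_full)
```

In this toy setting, a record whose `X2` value is missing is still classified, because the unobserved variable is marginalized instead of causing the record to be dropped. Working the first query by hand gives P(C = c1 | X1 = 1) = 0.28 / 0.40 = 0.7.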
Keywords/Search Tags:Algorithm, Classification, Bayesian network, Records with missing values, QUEST, Variable, Methods, Tree