Font Size: a A A

Research On Offline And Online Algorithms For Real Time Inference In Complex Diagnostic Bayesian Networks

Posted on:2013-02-19Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y X HuangFull Text:PDF
GTID:1118330371481388Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Bayesian networks (BNs) is a tool which helps people to apply probability andstatistics to complex tasks as well as carry on probabilistic inference and dataanalysis. Probabilistic inference is the process of answering queries throughcomputation. The problem of probabilistic inference in BNs includes threecategories: the problem of posterior probability, the problem of maximum aposteriori hypothesis and that of most probable explanation. Here, the problem ofposterior probability is: compute posterior probability for a set of query variables Q,given the observations e to a set of evidence variables E; i.e., to compute P(Q|E=e).BNs inference algorithms consist of exact and approximate inferencealgorithms. Conventional exact inference algorithms include, for example, the classof query-based algorithms (e.g., Variable Elimination (VE)) or the class ofall-marginals algorithms (e.g., Clique Tree Propagation (CTP)). Conventionalapproximate inference algorithms include, for instance, random sampling algorithms,variational methods, model simplification methods, and search-based algorithms.Major problem of conventional exact inference algorithms is: it is very difficult toapply the algorithms to complex BNs since the complexity of the algorithms is toohigh. It is because exact inference is NP-hard. Major problem of conventionalapproximate inference algorithms is:(i) the precision of the algorithms is often notideal;(ii) in complex BNs, the complexity of the algorithms can also be very high. Itis because approximate inference is also NP-hard. Performing real time inference incomplex BNs is challenging.One strategy to improve the efficiency of BNs inference is to divide upinference into an offline one and an online one. The offline one is far and away themost costly computationally, and the online one computes posterior probabilities. Forinstance, Query DAG (Q-DAG) is a state-of-the-art paradigm for implementinginference in diagnostic Bayesian networks (DBNs). Here, the node set of a DBNconsists of three subsets: the diagnostic node set D, the measurement node set M,and the intermediate node set I. Inference in DBNs is to compute posteriorprobability for a set of diagnostic variables Q (Q D), given the observations e to aset of measurement variables E (E M); i.e., to compute P(Q|E=e).Q-DAG includes an offline algorithm called Q-DAG generation (Q-DAG-G)and an online algorithm named Q-DAG evaluation (Q-DAG-E). Q-DAG-G convertsthe problem of computing posterior probabilities into that of evaluating arithmetic expressions; Q-DAG-E evaluates the expressions generated by Q-DAG-G, and thecomplexity of Q-DAG-E is linear in the size of the expressions.However, it is extremely difficult to apply Q-DAG in complex DBNs where|D|(the number of nodes in D) is large. The reasons are as follows.(i) One should use query-based algorithms (e.g., VE) instead of all-marginalsalgorithms (e.g., CTP) to implement Q-DAG-G in complex DBNs. It is because theformer compares more and more favorably to the latter as network complexityincreases.(ii) Q-DAG-G shall generate∑X D∏X∈X|dom(X)|arithmetic expressions for allqueries. For each X D,∏X∈X|dom(X)|expressions can be generated by running thequery-based inference algorithm once.(iii) According to (i) and (ii), the run-time of Q-DAG-G generating arithmeticexpressions for all queries is nearly2|D|times of the run-time of the query-basedalgorithm answering one query.In the paper, we propose offline algorithm Transform-DBN (TD) and onlinealgorithm Compute-Posterior-Probability (CPP) for real time inference in complexDBNs where|D|is large. TD transforms a DBN into a set of factor sets and CPPcomputes posterior probabilities on the factor sets. Here, a factor on variablesX1,…,Xnis a representation of a function from dom(X1)×…×dom(Xn) into the realnumbers, where dom(X) denotes the domain of variable X.Major differences between TD and Q-DAG-G are as follows.(i) Q-DAG-Gdoes not take the structure of a DBN into consideration, while TD explores a uniquestructure of a DBN.(ii) Q-DAG-G is query-based, while TD isquery-category-based.Specifically, TD first prunes DBNs to simplify inference. Second, TD dividesall queries into categories; for each category, TD searches for and eliminates a uniquevariable set to generate a factor set on which answering certain queries (belong to thecategory) is relatively efficient. The run-time of TD is at most n (the number ofcategories) times of that of VE given the same elimination ordering.CPP first chooses factor set according to the query, and second prunes the factorset to simplify inference, and then computes posterior probability on the prunedfactor set. Note that, there are circumstances where CPP may not be as efficient asQ-DAG-E. For example, in the worst case, CPP reduces to VE. However, empiricalresults show that, there are complex DBNs where CPP can not only outperform VEand CTP exponentially, but also terminate in real time.According to the features of Q-DAG, TD and CPP, we believe Q-DAG performsbetter in simple DBNs and complex DBNs where|D|is rather small, while TD andCPP excel in complex DBNs where|D|is large.
Keywords/Search Tags:Complex Diagnostic Bayesian networks, Real time inference, Offline, Online
PDF Full Text Request
Related items