
Searching for the Contemporary and Temporal Causal Relations from Data

Posted on: 2013-02-07
Degree: Ph.D.
Type: Thesis
University: The Chinese University of Hong Kong (Hong Kong)
Candidate: Wang, Zhenxing
Full Text: PDF
GTID: 2458390008974291
Subject: Statistics
Abstract/Summary:
Causal analysis has drawn a lot of attention because it provides deep insight into the relations between random events. Graphical models are a dominant tool for representing causal relations. Under the graphical model framework, the causal relations implied in a data set are captured by a Bayesian network defined on that data set, and causal discovery is achieved by constructing a Bayesian network from the data. Bayesian network learning therefore plays an important role in causal relation discovery. In this thesis, we develop a Two-Phase Bayesian network learning algorithm that learns Bayesian networks from data. Phase one of the algorithm learns Markov random fields from the data, and phase two constructs Bayesian networks from the Markov random fields obtained. We show that the Two-Phase algorithm provides state-of-the-art accuracy, and that the techniques proposed in this work can be easily adopted by other Bayesian network learning algorithms. Furthermore, we show that the Two-Phase algorithm can be used for time series analysis by evaluating it against a series of time series causal learning algorithms, including VAR and SVAR. Its practical applicability is also demonstrated through empirical evaluation on a real-world data set.

We start by presenting a constraint-based Bayesian network learning framework that generalizes the SGS algorithm [86]. We show that the key step in learning Bayesian networks efficiently is restricting the search space of conditioning sets. This leads to the core of this thesis: the Two-Phase Bayesian network learning algorithm. Here we show that by learning Bayesian networks from Markov random fields, we reduce the computational complexity and enhance the reliability of the algorithm. In addition to proposing this Bayesian network learning algorithm, we use zero partial correlation as an indicator of conditional independence. We show that partial correlation can be applied to arbitrary distributions, provided that the data are generated by linear models. In addition, we prove that the Gaussian distribution is a special case of the linear structural equation model. We then compare our Two-Phase algorithm with other state-of-the-art Bayesian network learning algorithms on several real-world Bayesian networks that are used as benchmarks in many related works.

Having built an efficient and accurate Bayesian network learning algorithm, we then apply it to causal relation discovery on time series. First, we show that the SVAR model is incapable of identifying contemporaneous causal orders for Gaussian processes because it fails to discover structures faithful to the underlying distributions. We also develop a framework for learning the true SVAR and VAR models using Bayesian networks, which is distinct from existing work. Finally, we show its applicability to a real-world problem.
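
The use of zero partial correlation as an indicator of conditional independence under linear models can be illustrated with a small sketch. The code below is not taken from the thesis: the function name `partial_correlation`, the chain X -> Y -> Z, the coefficient 0.8, and the uniform noise are all assumptions chosen for illustration. It computes the partial correlation from the inverse of the sample covariance matrix, a standard identity, and checks that X and Z are correlated marginally but have a partial correlation close to zero once Y is conditioned on, even though the noise is not Gaussian.

```python
import numpy as np

def partial_correlation(data, i, j, cond):
    """Sample partial correlation of columns i and j given the columns in cond.

    Hypothetical helper for illustration. It uses the identity
    rho_ij.cond = -P_ij / sqrt(P_ii * P_jj), where P is the inverse of the
    covariance matrix restricted to the variables {i, j} and cond.
    """
    idx = [i, j] + list(cond)
    cov = np.cov(data[:, idx], rowvar=False)
    prec = np.linalg.inv(cov)
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])

# Assumed toy model: a linear chain X -> Y -> Z with uniform (non-Gaussian) noise.
rng = np.random.default_rng(0)
n = 20000
x = rng.uniform(-1, 1, n)
y = 0.8 * x + rng.uniform(-1, 1, n)
z = 0.8 * y + rng.uniform(-1, 1, n)
data = np.column_stack([x, y, z])

r_marginal = partial_correlation(data, 0, 2, [])   # X and Z, no conditioning
r_given_y = partial_correlation(data, 0, 2, [1])   # X and Z given Y

print(f"corr(X, Z)     = {r_marginal:.3f}")
print(f"corr(X, Z | Y) = {r_given_y:.3f}")
```

In a constraint-based learner, a partial correlation that is statistically indistinguishable from zero would typically lead to removing the edge between the two variables. A real test would use a significance threshold (for example, via Fisher's z-transform) instead of reading the value by eye.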
Keywords/Search Tags:Bayesian network, Causal, Relations, Data, Real world, Show, Markov random fields