Research on Causal Relationship Discovery for Measurement Models

Posted on: 2021-03-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Xie
Full Text: PDF
GTID: 1368330602993444
Subject: Computer applications engineering
Abstract/Summary:
In recent years, a growing number of researchers have turned their attention to the explanatory role of causal inference in machine learning, artificial intelligence, and related fields, and causal inference has been widely applied in social science, economics, medical science, and other areas. Traditional causal discovery relies mostly on interventions or randomized experiments. In practice, however, such experiments often demand substantial human and financial resources, and sometimes they cannot be carried out at all. Causal discovery from observational data has therefore become an important line of research and has attracted increasing attention.

Structural Equation Modeling (SEM) provides a basic framework for multivariate analysis and is also a powerful tool for learning causal relationships. Many researchers therefore use SEM to learn the causal relationships among observed variables from the data distribution that the causal mechanism induces. Models that combine a structural equation model with a causal graph are collectively called Functional Causal Models (FCMs); some authors call them Structural Causal Models (SCMs). However, most existing methods assume that the observed variables are themselves the variables that act as causes or effects, and they ignore the effect of measurement error.

This dissertation studies the theory of causal discovery under measurement models, together with its practical application. It addresses three problems: how to fully identify the causal network under the linear measurement error model; how to discover the causal relationships between latent variables under the 1-Factor measurement model (the measured variables share one common latent variable); and how to discover the causal relationships between latent variables under the N-Factor measurement model (the measured variables may share any number of common latent variables).

The main contributions of this dissertation are as follows.

(1) Fully identifying the causal network from data that contain measurement errors. The classic LiNGAM model has two problems: (a) it fails when the noise in the data is Gaussian; (b) it misjudges the causal direction when the data contain measurement errors. We propose a two-stage approach based on information entropy that handles arbitrarily distributed data, and we introduce the reliability ratio, supplied as expert knowledge, to recover the true data when performing causal discovery on data with measurement errors. Compared with existing algorithms, our algorithm has the lowest complexity, O(mn^2). Experimental results on real-world causal structures demonstrate the effectiveness and stability of the method. We also apply the algorithm to mobile base station data with measurement errors, and the results further support its effectiveness.

(2) Estimating the causal structure of latent variables under the 1-Factor measurement model. By properly leveraging the non-Gaussianity of the data, we propose a testable Triad condition that can estimate the structure of the latent variables. Specifically, we construct a "pseudo-residual" from any three related variables and show that, when the causal relations are linear and the noise terms are non-Gaussian, the causal direction between the latent variables behind the three observed variables is identifiable by checking a certain independence relationship. In other words, the Triad condition helps locate latent confounders and determine the causal direction between latent variables. For non-Gaussian data, the Triad condition goes far beyond the Tetrad condition and reveals more information about the underlying latent structure. Based on the Triad condition, we develop a two-stage algorithm that learns the causal network of latent variables in the measurement model. Finally, we verify the theoretical correctness of the algorithm on simulated data and apply it to Hong Kong stock market data, where it detects the latent variables behind the stock data and their causal relationships.

(3) Estimating the causal structure of latent variables under the N-Factor measurement model. We consider the linear non-Gaussian causal model and propose a Generalized Independent Noise (GIN) condition for estimating the causal structure of latent variables. Specifically, for any two random vectors X and Z, we first find a parameter vector ω based on the cross-covariance between X and Z; GIN holds if and only if ω^T X and Z are statistically independent. Graphically speaking, if this condition holds, then in the linear non-Gaussian latent model the latent common causes of the variables in X d-separate X from Z. The independent noise condition, which states that in the absence of confounders the cause is independent of the residual from regressing the effect on the cause, can be seen as a special case of GIN. Furthermore, we propose a two-step algorithm that locates the latent variables (including their number) and learns their causal order. Experimental results on synthetic and real-world data demonstrate the effectiveness of the method.
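The GIN check described in contribution (3) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the toy model (one latent L1 causing a second latent L2, each measured by two indicators), the function names `gin_omega`, `gin_dependence`, and `simulate`, and the dependence score are all assumptions made for illustration. In particular, the abstract does not say which independence test is used, so the sketch substitutes a crude statistic, the correlation between squared values, where a proper test such as HSIC would normally be applied. The vector ω is taken as a left null direction of the sample cross-covariance between X and Z.

```python
import numpy as np


def gin_omega(X, Z):
    """Unit vector omega with omega^T Cov(X, Z) ~ 0.

    Assumes dim(X) exceeds the rank of the cross-covariance, so that a
    left null direction exists.
    """
    Xc = X - X.mean(axis=0)
    Zc = Z - Z.mean(axis=0)
    cross_cov = Xc.T @ Zc / len(Xc)      # shape (dim X, dim Z)
    u, _, _ = np.linalg.svd(cross_cov)   # full basis of left singular vectors
    return u[:, -1]                      # direction with the smallest singular value


def gin_dependence(X, Z):
    """Crude dependence score between omega^T X and Z (near 0 suggests GIN holds).

    Stand-in statistic (assumption): largest |corr| between squared values.
    """
    omega = gin_omega(X, Z)
    e = (X - X.mean(axis=0)) @ omega
    Zc = Z - Z.mean(axis=0)
    return max(abs(np.corrcoef(e**2, z**2)[0, 1]) for z in Zc.T)


def simulate(n=20000, seed=0):
    """Hypothetical toy model: L1 -> L2; X1, X2 measure L1; X3, X4 measure L2."""
    rng = np.random.default_rng(seed)

    def unif(scale=1.0):
        # Uniform noise is non-Gaussian, as the linear non-Gaussian model requires.
        return scale * rng.uniform(-1.0, 1.0, n)

    L1 = unif()
    L2 = L1 + unif()
    x1, x2 = L1 + unif(0.3), L1 + unif(0.3)
    x3, x4 = L2 + unif(0.3), L2 + unif(0.3)
    return x1, x2, x3, x4


x1, x2, x3, x4 = simulate()

# X = (X2, X3), Z = X1: the common cause L1 d-separates X from Z, so GIN should hold.
holds = gin_dependence(np.column_stack([x2, x3]), x1[:, None])

# X = (X1, X4), Z = X3: X4 and X3 share L2, which L1 does not block, so GIN should fail.
violated = gin_dependence(np.column_stack([x1, x4]), x3[:, None])

print(f"score when GIN should hold: {holds:.3f}")
print(f"score when GIN should fail: {violated:.3f}")
```

In this toy setting the first score should be close to zero and the second clearly larger. That asymmetry is what lets L1 be placed before L2 in the causal order. With a scalar Z and a two-dimensional X, ω^T X is proportional to a pseudo-residual built from three variables, in the spirit of the Triad condition in contribution (2).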
Keywords/Search Tags:Causality, Causal discovery, Functional causal model, Measurement model, Latent variable