Research And Application Of Causal Inference Algorithm Based On Structural Equation Model

Posted on: 2020-09-17
Degree: Master
Type: Thesis
Country: China
Candidate: X Liu
GTID: 2428330596976527
Subject: Engineering
Abstract/Summary:
In recent years, causal inference on observed variables has received widespread attention in the scientific community. Compared with correlation or other statistical relationships, understanding the causal relationships between variables is more valuable in fields such as medicine, economics, and sociology. A series of classical causal inference models based on either conditional independence or structural equation models has emerged. Methods based on conditional independence, however, can only identify the causal structure up to a Markov equivalence class, so this thesis focuses on methods based on structural equation models. We address the following two tasks.

(1) Given the joint distribution of two variables X and Y, a structural equation model assumes that the effect Y is a function of its direct cause X and some noise N. With proper model constraints, an asymmetry that holds only in the true causal direction can be derived and used for inference. However, most existing research considers the case in which both variables have the same data type, that is, both are continuous random variables or both are discrete random variables. In this thesis, we focus on inferring the causal direction when the two variables have different data types, that is, one is a discrete random variable and the other is a continuous random variable. We propose a model for mixed-type data based on the additive noise model. Using the fact that the cause and the noise are independent only in the correct causal direction, we derive an information-theoretic asymmetry between the forward model and the backward model and propose guiding principles for causal inference. Given observed data, we propose a discrete regression algorithm and a continuous classification algorithm to compute the residual entropy. Empirical results on synthetic and real data demonstrate the effectiveness of the proposed model.

(2) In the analysis of large-scale collections of causal pairs, the model assumptions placed on the observations are restrictive, so a single algorithm based on a structural equation model is not well suited to analyzing data pairs generated by different mechanisms. Therefore, given training data, we treat the prediction of causal relationships as a supervised learning problem: machine learning methods are trained to learn the statistical properties of causal data and to predict the causal relationship of each data pair. We propose using algorithms based on structural equation models as features. The data are pre-processed by standardization, discretization, relabeling, and so on; features are then extracted, and models are trained with logistic regression, random forest, and XGBoost. The experimental results improve on those of previous studies.
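The additive-noise asymmetry described in task (1) can be illustrated with a minimal sketch. This is not the thesis's mixed-type algorithm: it assumes two continuous variables, uses a cubic polynomial fit as the regression step, and scores the dependence between the putative cause and the residual with a biased HSIC estimate instead of the residual entropy. The function names (`hsic`, `residual_dependence`, `infer_direction`) and the synthetic data are hypothetical choices made only for this example.

```python
import numpy as np


def hsic(a, b):
    """Biased HSIC estimate with Gaussian kernels (median-heuristic bandwidth)."""
    n = len(a)

    def gram(v):
        d2 = (v[:, None] - v[None, :]) ** 2
        return np.exp(-d2 / np.median(d2[d2 > 0]))

    centering = np.eye(n) - np.ones((n, n)) / n
    return np.trace(gram(a) @ centering @ gram(b) @ centering) / n ** 2


def residual_dependence(cause, effect, degree=3):
    """Fit effect = f(cause) + residual and score how much the residual depends on the cause."""
    coeffs = np.polyfit(cause, effect, degree)
    residual = effect - np.polyval(coeffs, cause)
    return hsic(cause, residual)


def infer_direction(x, y):
    """Prefer the direction whose residual looks more independent of the putative cause."""
    x = (x - x.mean()) / x.std()
    y = (y - y.mean()) / y.std()
    forward = residual_dependence(x, y)   # model X -> Y
    backward = residual_dependence(y, x)  # model Y -> X
    direction = "X->Y" if forward < backward else "Y->X"
    return direction, forward, backward


# Synthetic example (an assumption of this sketch): Y = X^3 + N with N independent of X.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, 400)
y = x ** 3 + rng.normal(0, 1, 400)

direction, fwd, bwd = infer_direction(x, y)
print(direction, fwd, bwd)
```

In the forward fit the residual is approximately the independent noise N, so its dependence score stays small. In the backward fit the residual spread changes with Y, which the score picks up. The thesis replaces both the regression step and the score with procedures suited to one discrete and one continuous variable.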
Keywords/Search Tags: causal inference, mixed-type data, structural equation model, machine learning, feature extraction