Counterfactual Causal Inference Method And Application Based On Observational Data

Posted on: 2023-08-02  Degree: Master  Type: Thesis
Country: China  Candidate: J H Yang
GTID: 2568307037453504  Subject: Software engineering
Abstract/Summary:
Although most data science has focused on using the tools of statistics and machine learning to make predictions, the root of many problems is causality. As applications of causal inference have spread, determining causal relationships from observational data has become an active research topic. Data collected in real environments are often of mixed types and involve latent variables, yet many causal inference methods handle only a single data type and assume no confounding. As a result, many correlations are misidentified as causal relationships, which lowers the credibility of the inferred causal structure. To address this, the thesis makes the following two contributions.

1. This thesis proposes a causal relationship inference algorithm based on observational data (ODCI), which determines causality directly from bivariate mixed-type variables and addresses problems that affect most current algorithms. First, the method combines numerical and morphological methods to analyze the correlation between variables; then the causal influence between variables is determined by a quasi-experimental method, and the credibility of the experimental results is improved with a sign test; finally, the causal direction between variables is determined by a systematic summary analysis. The feasibility of the method is confirmed theoretically, and its accuracy is verified on the real public dataset Cause-Effect Pairs. The experimental results show that the algorithm is more accurate than the traditional PNL, IGCI, ANM, and LiNGAM algorithms.

2. To accurately quantify the causal effects between patient survival time and various variables, this thesis proposes a counterfactual causal inference algorithm based on observational data (CCIA). The algorithm combines the ODCI algorithm with statistical methods to quantify the causal effects between individual factors and survival time in lung cancer patients. First, a statistical method is used to obtain the set of parent and child nodes of the target node. Then the ODCI algorithm is used to construct a local causal network graph, and the causal structure of this node and its parent and child nodes is merged into the causal network of the whole dataset; this is repeated until all nodes have been added to the causal graph, yielding the complete causal network structure. Finally, the hypothetical intervention method is used to quantify the causal relationships associated with the target variables in the causal graph. Experiments were conducted on simulated data and on real lung cancer patient data. The results showed that NLR, smoking, treatment modality, and lung cancer stage were the main factors affecting the prognosis of lung cancer patients, that leukocyte count was the most important indirect influence on patient prognosis, and that the causal effect between NLR and patient prognosis was -0.7.
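The abstract does not say how the sign test is applied inside ODCI. As a rough illustration only, the following sketch shows a standard two-sided exact sign test on paired differences, assuming the quasi-experiment yields one outcome difference per matched pair. The function name `sign_test_p_value` and the sample data are hypothetical and are not taken from the thesis.

```python
from math import comb


def sign_test_p_value(differences):
    """Two-sided exact sign test for H0: the median difference is zero."""
    pos = sum(1 for d in differences if d > 0)
    neg = sum(1 for d in differences if d < 0)
    n = pos + neg  # zero differences (ties) are dropped
    if n == 0:
        return 1.0
    k = min(pos, neg)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for the two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical paired differences (outcome under "treated" minus "control").
diffs = [0.4, 1.2, 0.3, 0.9, -0.1, 0.7, 0.5, 1.1, 0.2, 0.6]
p = sign_test_p_value(diffs)
print(f"p-value = {p:.4f}")
```

A small p-value here would suggest that the direction of the differences is unlikely to be due to chance alone; how ODCI turns such a result into a causal-influence decision is not described in the abstract.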
Keywords/Search Tags: Causal inference, Hypothetical interventions, Cancer, Quasi-experiments