An Enhanced Causal Discovery Algorithm Based On Sparse Causal Graph

Posted on: 2024-07-29
Degree: Master
Type: Thesis
Country: China
Candidate: R F Zhang
Full Text: PDF
GTID: 2530307085498784
Subject: Economic big data analysis
Abstract/Summary:
Statistical learning algorithms typically rely on associations among the variables under study, and most of these associations stem from underlying causal structures. Effective inference and use of causal knowledge is therefore of significant value in statistical learning. Traditional causal discovery methods depend on Markov properties and constrained optimization. In recent years, researchers have increasingly recast causal discovery as a supervised classification problem and designed new discovery methods on that basis. However, supervised methods require high-quality training data, and labeled causal-relationship data is expensive to obtain, which significantly limits the application of supervised causal discovery. In this study, we address the scarcity of empirical causal-relationship data in real-world settings. We first introduce a semi-supervised augmentation model called C-tri DNN, an ensemble built on the Tri-training framework that consists of three interconnected deep networks and is designed specifically for semi-supervised learning on sparse data. To extract more information from the empirical data, this paper summarizes existing feature extraction methods at different levels and proposes using causal graph topology information as input, which significantly improves the quality of the information embedding. However, because empirical data may not satisfy the assumption of independent and identically distributed samples, and because sparse data can weaken the model's generalization ability, this paper also introduces CE-MMOE, a boosting model based on multi-task learning. Several unsupervised methods generate pseudo-labels for the sparse data, and learning from these pseudo-labels is incorporated as a subtask alongside the supervised task in the multi-task MMOE model. The primary task thus learns not only the context-specific information in the empirical data but also the causal data-generating mechanism captured by the unsupervised methods, which secures a lower bound on the model's prediction performance. Experimental results show that the two proposed methods significantly outperform unsupervised and supervised methods on sparse data sets.
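The Tri-training idea mentioned in the abstract can be illustrated with a minimal sketch. It shows only the classical agreement rule (an unlabeled sample is pseudo-labeled for one model when the other two models agree on it), not the C-tri DNN architecture itself, whose details are not given here. The function name `tri_training_pseudo_labels`, the label encoding, and the example predictions are hypothetical and chosen purely for illustration.

```python
import numpy as np

def tri_training_pseudo_labels(preds):
    """Classical Tri-training agreement rule (illustrative sketch only).

    preds: integer array of shape (3, n_unlabeled) holding the labels that
           three separately trained models predict for the unlabeled samples.
    Returns a list of three (indices, labels) pairs. Entry i lists the
    unlabeled samples on which the OTHER two models agree, together with
    the agreed label; these would be added to model i's training set.
    """
    preds = np.asarray(preds)
    if preds.ndim != 2 or preds.shape[0] != 3:
        raise ValueError("expected predictions of shape (3, n_unlabeled)")

    result = []
    for i in range(3):
        j, k = [m for m in range(3) if m != i]   # the two peer models
        agree = preds[j] == preds[k]             # where the peers agree
        idx = np.flatnonzero(agree)
        result.append((idx, preds[j][idx]))
    return result

# Hypothetical predictions for five unlabeled variable pairs
# (assumed encoding: 1 = "X causes Y", 0 = otherwise).
preds = np.array([
    [1, 0, 1, 0, 1],   # model 0
    [1, 1, 1, 0, 0],   # model 1
    [0, 1, 1, 0, 1],   # model 2
])

for i, (idx, labels) in enumerate(tri_training_pseudo_labels(preds)):
    print(f"model {i}: pseudo-label samples {idx.tolist()} as {labels.tolist()}")
```

In a full training loop, each model would be retrained on its labeled data plus these pseudo-labeled samples, and the step would repeat until the pseudo-label sets stop changing.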
Keywords/Search Tags: Causal discovery, semi-supervised learning, causal graph topological features, multi-task learning, enhanced learning