
Causal Inference For Partial-Linear Model

Posted on: 2019-07-19
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Lai
Full Text: PDF
GTID: 2310330542973372
Subject: Application probability statistics
Abstract/Summary:
Causal inference is a discipline devoted to exploring causal relations between things on the basis of statistics. Since the birth of modern natural science, more and more scientists have put forward their own understanding of causality. After constant evolution and integration, the theory moved from Laplace's "causal determinism" to Hume's "causal empiricism", and then to the now widely accepted "causal probability", which reflects the importance of causal reasoning. As a science based on quantum theory and probability theory, causal inference has been widely used since its birth in many fields, such as educational research, behavioral research, psychometrics, econometrics, sociology, epidemiology and biostatistics.

The data used in causal inference, like the data used in statistical inference generally, are complex and varied: high-dimensional and low-dimensional data, linear, non-linear and partially linear data, continuous and discrete data, missing and complete data, and so on. From model-free reasoning based on Bayesian conditional probability, to model-based reasoning, and then to methods for high-dimensional data, different methods of causal reasoning suit different data types. Most existing research on model-based causal inference analyzes either linear models or nonlinear models. The real relationship in actual data, however, is often neither completely linear nor completely non-linear. If we insist on using purely linear or purely non-linear causal theory in such cases, we may reach invalid conclusions, and fitting the wrong model to the data may lose a great deal of information. A better approach is to use a mixture of the two, the partial-linear model. Unfortunately, there is no existing research on causal inference for partial-linear regression models. This paper therefore focuses on causal inference for partially linear models, so as to extend the application of causal inference to more fields.

This paper reviews the current state and the research history of three models (linear, nonlinear and partial-linear), as well as related problems studied at home and abroad. It then briefly outlines the causal inference theory of linear models and nonlinear models, on the basis of which a method of causal inference for the partial-linear model is proposed. The method is called the partial-linear-kernelized-trace method (abbreviated PL-KTM) and consists of two steps: first, estimate the unknown parameters and the unknown functions of the model by using the theory of reproducing kernel Hilbert spaces and the profile local least-squares method with a penalty term; second, establish criteria to determine the causal relations for the partial-linear model. To widen the field of application of the method, this paper also puts forward a partial-linear causal inference method for discrete data, with some illustrations to verify its efficiency. Statistical simulation analysis is then conducted to validate the rationality of the proposed method, and concrete proofs of some important theorems are given.

Finally, this dissertation applies the methods to a case study: causal analysis of metastasis data for solitary papillary thyroid carcinoma. Because the pathological variables in these data are high-dimensional and the response variable (cancer metastasis) is discrete, the following work is done before analyzing the partial-linear causal relations in the thyroid cancer data: dimension reduction, correlation diagnosis, serialization of the cancer metastasis variable, and verification of the serialization. To check that the serialization of cancer metastasis is reasonable, PL-KTM is used to carry out causal analysis on the dimension-reduced pathological variables and the new, serialized metastasis variable. In this process, the correlation diagnosis is used to illustrate the difference between association and causality; the details are described in the full text. The thesis ends with a summary of the results and an outlook on future work.

In conclusion, the proposed methods, namely the causal reasoning algorithm PL-KTM for the partial-linear model and the serialization method based on linear discriminant analysis, can effectively analyze causal relations for many kinds of data in different fields. They have the following advantages: first, with no limit on variable dimension, they apply to both low-dimensional and high-dimensional data; second, with no limit on variable type, they suit both discrete and continuous variables; third, they also handle data with partial-linear relations.
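
The estimation step described above can be illustrated with a minimal sketch. It is not the thesis's PL-KTM: it omits the reproducing kernel Hilbert space machinery, the penalty term, and the causal criteria, none of which are specified in this abstract. It only shows the general idea of profile least squares for a partial-linear model, assumed here to have the form y = x·beta + g(z) + noise, using a plain Nadaraya-Watson smoother. All function names, the model notation, the bandwidth, and the simulated data are assumptions made for illustration.

```python
import numpy as np


def nw_smoother(z, bandwidth):
    """Nadaraya-Watson smoother matrix S with a Gaussian kernel.

    (S @ v)[i] is a kernel-weighted average of v around z[i].
    """
    diff = (z[:, None] - z[None, :]) / bandwidth
    weights = np.exp(-0.5 * diff ** 2)
    return weights / weights.sum(axis=1, keepdims=True)


def fit_partial_linear(x, z, y, bandwidth=0.05):
    """Profile least squares for y = x @ beta + g(z) + noise (illustrative).

    1. Smooth x and y on z and take residuals, which removes g(z).
    2. Regress the y residuals on the x residuals to estimate beta.
    3. Smooth y - x @ beta on z to estimate g at the observed z values.
    """
    s = nw_smoother(z, bandwidth)
    x_res = x - s @ x
    y_res = y - s @ y
    beta_hat, *_ = np.linalg.lstsq(x_res, y_res, rcond=None)
    g_hat = s @ (y - x @ beta_hat)
    return beta_hat, g_hat


def simulate(n=400, seed=0):
    """Simulated data (hypothetical): linear in x, sinusoidal in z."""
    rng = np.random.default_rng(seed)
    z = rng.uniform(0.0, 1.0, n)
    x = rng.normal(size=(n, 2)) + z[:, None]  # x is correlated with z
    beta = np.array([1.5, -2.0])
    y = x @ beta + np.sin(2.0 * np.pi * z) + 0.1 * rng.normal(size=n)
    return x, z, y, beta


if __name__ == "__main__":
    x, z, y, beta_true = simulate()
    beta_hat, g_hat = fit_partial_linear(x, z, y)
    print("true beta:     ", beta_true)
    print("estimated beta:", beta_hat)
```

Removing the smooth dependence on z from both x and y before the regression is what lets the linear coefficients be estimated without specifying g in advance; the thesis's penalized, kernel-based variant would replace the simple smoother used here.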
Keywords/Search Tags: causal inference, partial-linear model, profile local least-squares method, high-dimensional data, discrete data