
Essential Issues In Mendelian Randomization Method And Its Application In Lung Cancer Risk Study

Posted on: 2022-06-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Zhu
GTID: 1484306743497254
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Observational research is susceptible to confounding, reverse causality, and various biases, so causal claims about modifiable risk factors and adverse outcomes derived from observational studies are often unreliable. Mendelian randomization (MR) uses genetic variants as instrumental variables (IVs) to determine whether an observed association between an exposure and an outcome (disease) is causal. Over the last decade or so, with the rapid development of genome-wide association studies (GWAS), MR methods have been adopted by more and more investigators. In practical applications, however, several key issues are often overlooked: whether the effects of the genetic variants (single nucleotide polymorphisms, SNPs) on the exposure meet the requirements for instrument strength, whether pleiotropy of the genetic variants is handled effectively, and whether the effects of the genetic variants on the exposure are heterogeneous across population strata or other subgroups. This study focuses on the indices used to evaluate instrument strength and on whether to adjust for covariates, and discusses problems in the practical application of MR based on the two-stage least squares (2SLS) method.

In Section I, within the 2SLS framework, we discuss the problems with the existing indicators of instrument strength and propose using the variation in the exposure explained by all instrumental variables in the first-stage regression (ΔR², the R² statistic of the first-stage model when it includes only the instrumental variables) to represent instrument strength. The simulation results suggest that the F-statistic and the R² statistic used in the past to evaluate instrument strength are not suitable for MR studies, because they are affected by factors such as the sample size and the number of instrumental variables. The F-statistic increases as the sample size increases and as the number of instrumental variables decreases. When ΔR² is held fixed, power shows no trend as the F-statistic increases: as the sample size increases or the number of instrumental variables decreases, the F-statistic rises markedly while power still shows no obvious trend. When ΔR² increases, power increases accordingly, and the rate at which power rises also increases with the sample size. Because genetic variants have small effects on exposures, ΔR² tends to be very low, and a large sample size is required to obtain high power. In particular, when the causal effect is about 0.3 and ΔR² is very low (around 0.005), 14,000 samples are sufficient for power to reach 80%, and the required sample size decreases sharply as ΔR² increases. In addition, with a sample size of 2000, ΔR² needs to be 0.038 to obtain 80% power, and the required ΔR² gradually decreases as the sample size increases.

Furthermore, we reach the following conclusions about whether covariates need to be adjusted for in 2SLS. If covariates are included in the first-stage regression used to build the prediction model of the exposure, they must also be adjusted for in the second stage to control bias in the causal estimate. If covariates are not included in the first-stage regression, the causal estimate is unaffected by whether they are adjusted for in the second stage. The bias is weakened as ΔR² increases. If the covariates are adjusted for in neither stage, the standard error is larger than in the other scenarios. If the covariates are adjusted for in the second stage, the standard errors are similar whether or not the covariates are adjusted for in the first stage.

In Section II, we first use individual-level data to explore the relationship between platelet count (PLT) and lung cancer (LC) risk using the 2SLS method of MR. In the first-stage regression, four SNPs that meet the IV assumptions were selected to construct a prediction model for PLT, and the second-stage regression model was used to analyze the relationship between the predicted PLT and the risk of lung cancer. In the UKB dataset, the MR results show a positive association between PLT and the risks of LC, non-small cell lung cancer (NSCLC), and lung adenocarcinoma (LUAD), but the association is not significant. Since the variation in PLT explained by the four selected instrumental variables is very small (ΔR² is about 0.007), it is difficult to obtain sufficient power with the current UKB sample size (11008; case:control = 1:4). We then carried out the second-stage regression analysis on the larger sample (32348 people) from the OncoArray/TRICL lung cancer case-control study. The MR results show that in LUAD the association between PLT and lung cancer risk is significant (P = 0.0088), with an OR of about 1.66; that is, the risk of LUAD increases by about 66% per one-standard-deviation increase in PLT. In LC, the association between PLT and risk was marginally significant (P = 0.0878), with an OR of around 1.3, while in NSCLC and lung squamous cell carcinoma (LUSC) there is no significant association between PLT and risk (P = 0.1342 and P = 0.7865, respectively). When the age range of the OncoArray/TRICL samples was adjusted to match that of UKB, the results for LC, NSCLC, and LUAD were consistent with the previous ones, but elevated PLT was found to be a protective factor in LUSC. Although both LUAD and LUSC belong to non-small cell lung cancer, the heterogeneity between them cannot be neglected, and the mechanism of PLT in LUAD and LUSC needs further study. We also used two-sample MR based on summary data to further study the relationship between PLT and LC risk and found a significant positive association: for each 100×10⁹/L increase in PLT, the risk of lung cancer increases by about 60% to 70%.

Section III gives a summary and outlook on the key assumptions of the MR method and the problems encountered in its application, together with the author's personal understanding of MR studies.
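The point about instrument strength can be illustrated with a minimal sketch. It is not the dissertation's simulation design: it assumes a continuous outcome, independent simulated SNPs with a minor allele frequency of 0.3, a single unmeasured confounder, and hypothetical function names (`first_stage_f`, `simulate`, `two_stage_least_squares`). It shows that, for a first-stage model containing only the instruments, the F-statistic is a function of R², the sample size n, and the number of instruments k, so F changes with n and k even when the explained variation is held fixed.

```python
import numpy as np


def first_stage_f(r2, n, k):
    """Overall F-statistic of a linear model with k instruments and n samples."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def simulate(n, k, r2_target, beta, rng, maf=0.3):
    """Simulate genotypes G, exposure x, and outcome y (all values are assumptions)."""
    G = rng.binomial(2, maf, size=(n, k)).astype(float)
    var_g = 2.0 * maf * (1.0 - maf)           # variance of one SNP coded 0/1/2
    alpha = np.sqrt(r2_target / (k * var_g))  # per-SNP effect so instruments explain ~r2_target
    resid_sd = np.sqrt((1.0 - r2_target) / 2.0)
    u = rng.normal(size=n)                    # unmeasured confounder of x and y
    x = G.sum(axis=1) * alpha + resid_sd * u + resid_sd * rng.normal(size=n)
    y = beta * x + u + rng.normal(size=n)
    return G, x, y


def two_stage_least_squares(G, x, y):
    """Return (causal estimate, first-stage R^2, first-stage F)."""
    n, k = G.shape
    # Stage 1: regress the exposure on the instruments only.
    Z = np.column_stack([np.ones(n), G])
    coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ coef
    r2 = 1.0 - np.sum((x - x_hat) ** 2) / np.sum((x - x.mean()) ** 2)
    # Stage 2: regress the outcome on the genetically predicted exposure.
    D = np.column_stack([np.ones(n), x_hat])
    beta_hat = np.linalg.lstsq(D, y, rcond=None)[0][1]
    return beta_hat, r2, first_stage_f(r2, n, k)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n, k in [(2000, 4), (20000, 4), (20000, 40)]:
        G, x, y = simulate(n, k, r2_target=0.05, beta=0.3, rng=rng)
        est, r2, f = two_stage_least_squares(G, x, y)
        print(f"n={n:6d} k={k:3d} R2={r2:.3f} F={f:8.1f} estimate={est:.3f}")
```

In this sketch the target R² is the same in every run, yet F should differ by roughly an order of magnitude between configurations, which is the dependence on sample size and instrument count that the abstract describes. A binary outcome such as lung cancer status would need a different second-stage model, which this sketch does not cover.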
Keywords/Search Tags: Mendelian randomization, two-stage least squares, instrumental variable, lung cancer