Low-frequency Genetic Variants Are Associated With Lung Cancer Susceptibility And Survival In Han Chinese

Posted on: 2018-10-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Zhu
GTID: 1314330515993902
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Lung cancer (LC) is among the most frequently diagnosed cancers and is the leading cause of cancer death worldwide, according to a 2012 report from the World Health Organization (WHO). In China, lung cancer has been the most common incident cancer in males and the second most common in females for several years. Moreover, the incidence and mortality of lung cancer have increased rapidly in most areas of China over the past decades, owing to the continuous increase in tobacco consumption and environmental pollution. Lung cancer has become one of the greatest threats to public health and needs to be addressed urgently.

Epidemiological studies have shown that the development of lung cancer results from a complex interplay between genetic and environmental factors. Although evidence indicates that over 80% of lung cancer cases in males and over 50% in females can be attributed to tobacco consumption, only a small fraction of smokers (usually <20%) eventually develop lung cancer, suggesting that, given exposure to these environmental risk factors, host genetic factors determine an individual's predisposition to LC. Surgery, radiotherapy and chemotherapy are the main treatments for LC, and prognosis prediction and classification are still based primarily on histological type and the TNM staging system. In practice, however, the prognoses of LC patients frequently differ significantly even among patients with the same histological type, clinical stage and treatment. This suggests that genetic background might be involved in the therapeutic response and prognosis of LC. Identifying genetic factors involved in the susceptibility and prognosis of LC has therefore become an active research field, because such factors can help identify high-risk individuals and guide individualized prevention and treatment.

The genome-wide association study (GWAS) has been one of the most powerful tools for the genetic study of complex diseases in recent years. It can detect hundreds of
thousands of common genetic variants at the same time. With large sample sizes and multi-stage validation, its results are more reliable and reproducible than those of other strategies. To date, GWAS has achieved considerable success in deciphering the genetic basis of the susceptibility and prognosis of LC. Using a GWAS strategy, we have identified several new loci, such as 5q32, 10p14, 13q12.12, 20q13.2 and 22q12.2, which are associated with susceptibility to LC in Chinese, and have also validated two loci (3q28 and 5p15.33) from previous studies. In prognosis GWAS of LC, we also identified several loci, including 3p22.1, 4q26, 5p14.1, 7q31.31, 9p21.3 and 14q24.3, associated with the prognosis of LC. These results provide valuable clues for research on the etiology, treatment response and prognosis of LC.

However, GWAS are mainly based on the "common disease/common variant" hypothesis and use tag SNPs to represent all common variants in the genome through linkage disequilibrium. These tag SNPs usually have high frequency but low penetrance, which results in "missing heritability" for several diseases. To help resolve this problem, the "common disease/rare variant" hypothesis is gaining ground. Low-frequency or rare variants usually have high penetrance and may account for an important part of the "missing heritability". High-throughput sequencing is the ideal strategy for association studies of low-frequency genetic variants, but its high cost limits its wide application. To provide a viable technology platform for systematically exploring low-frequency genetic variation, Illumina developed a new high-throughput chip based on whole-genome and exome sequencing of 12,031 individuals of different ancestries (including Asians, Europeans, African Americans, and Americans). The chip includes more than 240 thousand genetic variants, of which about 93.1% are located in exons and 89.5% are low
frequency variants. Therefore, to explore the association between low-frequency or rare variants and susceptibility to LC, we performed the first exome-wide association study of LC in a Chinese population using the Infinium HumanExome BeadChip. Furthermore, we integrated the survival and clinical information of these LC patients and evaluated the association between these low-frequency or rare variants and the prognosis of LC. The results of our study will help complete the map of genetic susceptibility and prognosis for LC in the Chinese population and improve the understanding of LC.

Part I: Low-Frequency Coding Variants at 6p21.33 and 20q11.21 are Associated with Lung Cancer Risk in Chinese Population

Genome-wide association studies have proved to be a powerful tool for the study of complex traits. Using a GWAS strategy, we and others have successfully identified about 30 common variants associated with lung cancer risk. However, these variants explain only a fraction of lung cancer heritability. GWAS usually focus on common variants and ignore the contribution of low-frequency or rare variants, although it has been proposed that such variants might have strong effects and contribute to the missing heritability. Recently, Wang et al. reported two large-effect, low-frequency variants, in BRCA2 (p.Lys3326X, MAF=0.009, OR=2.47, P=4.74×10^-20) and CHEK2 (p.Ile157Thr, MAF=0.007, OR=0.38, P=1.27×10^-13), implicated in susceptibility to lung cancer in Europeans based on existing GWAS imputation data. This study demonstrated the important role of low-frequency genetic variants in lung cancer susceptibility. However, no study to date has focused on the association between low-frequency genetic variants and lung cancer susceptibility in Han Chinese. To assess the role of low-frequency or rare variants in lung cancer development, we analyzed exome chips representing 1,348 lung cancer cases and 1,998 controls at the discovery stage and subsequently evaluated
promising associations in an additional 4,699 cases and 4,915 controls at the replication stages. Systematic quality control of the raw genotyping data was performed to filter out unqualified genetic variants and samples, and finally 72,423 variants in 1,341 cases and 1,982 controls were retained for association analysis. For the identified variants, we also assessed their influence on the age of onset of LC. TCGA data were also used to evaluate differential expression of the identified susceptibility genes between tumor and adjacent normal tissues. Single-variant and gene-based analyses were carried out for coding variants with minor allele frequency less than 0.05.

We identified three low-frequency missense variants, in BAT2 (rs9469031, encoding p.Pro515Leu; OR=0.55, P=1.28×10^-10), FKBPL (rs200847762, encoding p.Pro137Leu; OR=0.25, P=9.79×10^-12) and BPIFB1 (rs6141383, encoding p.Val284Met; OR=1.72, P=1.79×10^-7), that were associated with lung cancer risk. rs9469031 in BAT2 and rs6141383 in BPIFB1 were also associated with the age of onset of lung cancer (P=0.001 and 0.006, respectively). Gene-based analysis revealed that FKBPL, in which two independent variants were identified, might account for the association with lung cancer risk at 6p21.33 (SKAT-O test: P=1.29×10^-9; burden test: P=2.00×10^-10). Based on the TCGA database, we found that BAT2 and FKBPL at 6p21.33 and BPIFB1 at 20q11.21 were differentially expressed between lung tumors and paired normal tissues. Our results highlight the importance of low-frequency variants for lung cancer susceptibility and indicate the potential biological relevance of candidate genes at 6p21.33 and 20q11.21 in lung carcinogenesis.

Part II: Exome-Wide Association Study Identifies Low-Frequency Coding Variants in 2p23.2 and 7p11.2 Associated with Survival of Lung Cancer Patients

Using a GWAS strategy, several studies have assessed the associations between common genetic variants and overall survival in lung cancer, and multiple loci have been identified to
be associated with lung cancer survival, including 2p23, 2q22, 2q34, 3p22, 5p14, 6p21, 7q31, 9p21, 9p22, 11q22, 12q23, 13q33, 14q24, 15q21, 16q21, 19p13 and 21q22. However, these studies usually focus on common genetic variants, and the association between low-frequency genetic variants and lung cancer survival remains unclear. We therefore performed an exome-wide association study of non-small cell lung cancer (NSCLC) survival using the existing genotype data.

In this study, we used a case-only design. Among the 1,348 lung cancer cases genotyped using the Illumina HumanExome BeadChip, 1,008 patients with complete clinical and survival information were included. After systematic quality control, a total of 57,903 variants in 1,001 cases were used in the association analysis with a Cox model. For missense or splicing variants with P<1×10^-3, we further replicated the associations using imputed genotype data of 773 NSCLC patients from TCGA with a Cox regression model. As only some of the low-frequency variants were imputed with adequate quality, we also used expression data of the host genes to validate our findings indirectly. Gene-based and pathway-based analyses were also performed on non-synonymous or splice-site variants using the R packages "coxKM" and "ARTP".

After the two-stage analysis, we found that a low-frequency missense variant in CCT6A (rs33922584: hazard ratio (HR)=1.75, P=6.06×10^-4) was significantly associated with the prognosis of NSCLC patients, which was further replicated in TCGA samples (HR=4.19, P=0.015). Interestingly, the G allele of rs33922584 was significantly associated with high expression of CCT6A (P=0.019), which might lead to worse survival in TCGA samples (HR=1.15, P=0.047). In addition, rs117512489 in PLB1 (HR=2.02, P=7.28×10^-4) was also associated with the survival of NSCLC patients in our samples, but was supported only by gene expression analysis in TCGA (HR=1.15, P=0.023). rs33922584 (encoding p.Arg1131Gln) might damage the structure of CCT6A and influence its phosphorylation, and expression
analysis based on TCGA showed that CCT6A was significantly upregulated in tumor tissues of NSCLC. Gene-based and pathway-based analyses revealed a total of 32 genes, including CCT6A, and several potential pathways that might account for the survival of NSCLC. These results provide further evidence for the important role of low-frequency or rare variants in the prognosis of NSCLC patients, and indicate that variants at 2p23.2 and 7p11.2 are independent markers for survival in lung cancer.
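
As an illustration of the single-variant analysis described in Part I (coding variants with minor allele frequency below 0.05, reported as odds ratios), the following is a minimal Python sketch of an allelic case-control test on a 2x2 allele-count table. It is a simplified stand-in and should not be read as the dissertation's actual pipeline, which the abstract does not describe in detail; the function names and all genotype counts are hypothetical.

```python
import math


def alt_allele_frequency(n_het, n_hom_alt, n_samples):
    """Alternate-allele frequency among the 2 * n_samples chromosomes."""
    return (n_het + 2 * n_hom_alt) / (2 * n_samples)


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Wald test on the log odds ratio of a 2x2 allele-count table.

    Returns (odds_ratio, ci_low, ci_high, p_value).
    Assumes all four counts are greater than zero.
    """
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / case_alt + 1 / case_ref + 1 / ctrl_alt + 1 / ctrl_ref)
    log_or = math.log(odds_ratio)
    z = log_or / se
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    ci_low = math.exp(log_or - 1.96 * se)
    ci_high = math.exp(log_or + 1.96 * se)
    return odds_ratio, ci_low, ci_high, p_value


# Hypothetical genotype counts (NOT data from the study).
N_CASES, N_CONTROLS = 1000, 1000
case_het, case_hom_alt = 20, 0
ctrl_het, ctrl_hom_alt = 40, 0

maf_controls = alt_allele_frequency(ctrl_het, ctrl_hom_alt, N_CONTROLS)

# Keep only low-frequency variants, using the 0.05 threshold from the abstract.
if maf_controls < 0.05:
    case_alt = case_het + 2 * case_hom_alt
    ctrl_alt = ctrl_het + 2 * ctrl_hom_alt
    odds_ratio, ci_low, ci_high, p_value = allelic_association(
        case_alt, 2 * N_CASES - case_alt,
        ctrl_alt, 2 * N_CONTROLS - ctrl_alt,
    )
    print(f"MAF(controls)={maf_controls:.3f} OR={odds_ratio:.2f} "
          f"95% CI=({ci_low:.2f}, {ci_high:.2f}) P={p_value:.3g}")
```

An odds ratio below 1 corresponds to an allele that is less common in cases than in controls. A real analysis would more likely fit a logistic regression adjusted for covariates such as age, sex and smoking, and would apply a multiple-testing threshold across all tested variants.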
Keywords/Search Tags:lung cancer, low-frequency variant, genetic susceptibility, survival, exome-wide association study