Somatic Mutations Reveal The Pathogenesis Of Gastric Cancer Induced By The Risk Factors Exposure

Posted on: 2022-01-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: T C Zhang
GTID: 1484306311966909
Subject: Epidemiology and Health Statistics

Abstract/Summary:
Background

According to the International Agency for Research on Cancer (IARC), gastric cancer (GC) is the fifth most commonly diagnosed cancer and the fourth leading cause of cancer death, with an estimated 1,089,103 new cases and 768,793 deaths worldwide in 2020. IARC also estimated that about 43.9% of new GC cases and 48.6% of GC deaths worldwide in 2020 would occur in China. GC ranks third in both incidence and cause of death among malignant tumors in China, which highlights the urgency of preventing and treating it. GC is a multifactorial disease that can be caused by both environmental and genetic factors. Its risk factors include age, gender, race, infection and imbalance of microbiota, dietary habits, lifestyle, socioeconomic status, disease and drug use, occupational exposure, genetic susceptibility, and family history. Identifying the potentially modifiable factors among these makes it possible to prevent GC.

Meanwhile, the rapid development and maturity of modern sequencing and bioinformatics technology make it possible to analyze the specific patterns of GC etiology from the perspective of tumor genomics. Tumor genomic mutations include germline mutations (susceptibility mutations) and somatic mutations (driver mutations). Germline mutations are inherited from parents and transmitted to offspring, and they occur in tissues throughout the body. Somatic mutations, by contrast, are neither inherited from parents nor transmitted to offspring, and they occur only in tumor tissue. Germline mutations play an important role in the development of familial tumors, while somatic mutations are a necessary condition for tumor occurrence and are related to environmental exposures (such as physical and chemical factors, microbial infections, and other risk factors). At present, the somatic mutation hypothesis is a generally accepted mechanism of tumorigenesis. Many studies have used next-generation sequencing to assess somatic mutations in GC at the genomic level, draw somatic mutation landscapes, identify mutational signatures, and further reveal the "subtypes" of somatic mutations in GC, providing an important scientific basis for revealing GC etiology and developing targeted therapy for GC. In addition, exposure to environmental risk factors usually leaves a "mark" on the tumor's gene mutations, that is, a mutational signature. A mutational signature can indicate the exposure characteristics (physical, chemical, and biological) behind tumor somatic mutations, which provides new research ideas for revealing the etiology of tumors.

At present, several problems in GC etiology research urgently need to be solved. (1) The traditional risk factors are not enough to explain all occurrences of GC, and new risk factors are still unclear, so exploring new risk factors of GC remains important. (2) Somatic mutation is the key mechanism of tumorigenesis in most tumors, including GC. High-throughput sequencing makes it possible to analyze the "subtypes" of GC somatic mutations at the genomic level. This topic has become a hotspot in GC genomics research, but the research results still need to be further enriched. (3) Different risk factor exposures are closely related to specific somatic mutation types and mutational signatures, but this relationship needs to be further explored.

To address these problems, our research group established a research base in Taixing (a region with a high incidence of upper gastrointestinal cancer) and implemented a population-based case-control study. We have completed the collection of exposure factors and biological samples from cases and matched controls. Building on this preliminary work, in this study we used the data of this population-based case-control study to assess the associations between known risk factors and the risk of GC, and to fully assess the associations between a newly identified potential risk factor (oral health) and the risk of GC. We also assessed GC somatic mutation landscapes and mutational signatures using whole exome sequencing (WES), and further assessed the "subtypes" of GC somatic mutations at the tumor genomics level. Moreover, using the information on risk factor exposure, we evaluated the potential relationship between the "subtypes" of GC somatic mutations and GC risk factor exposure, and explored whether exposure to risk factors ultimately leads to GC through specific somatic mutations. The results of this study can provide a strong scientific basis for formulating prevention strategies in areas with a high incidence of GC, and important theoretical support for etiology research and precision treatment of GC.

Objectives

1. To evaluate the effects of known risk factor exposure on GC risk, and to comprehensively assess, at multiple levels, the associations between a newly identified potential risk factor (oral health) and the risk of GC.
2. Based on WES, to comprehensively evaluate GC somatic mutation types, draw somatic mutation landscapes, and identify mutational signatures, with the aim of identifying the "subtypes" of GC somatic mutations at the tumor genomics level.
3. Based on the data on risk factors and somatic mutations, to comprehensively evaluate the potential relationship between the "subtypes" of GC somatic mutations and GC risk factor exposure, and to explore whether exposure to risk factors ultimately leads to GC through specific somatic mutations.

Methods

1. Identification of risk factors for GC

Our research group established a research base in Taixing (a region with a high incidence of upper gastrointestinal cancer) and implemented a population-based case-control study. In this study, we used the data of that study to fully assess the associations between risk factor exposure and the risk of GC. We aimed to collect all newly diagnosed GC cases from October 2010 to September 2013 in the endoscopy units of the four largest local hospitals in Taixing, and supplemented missing GC cases using the local Cancer Registry system. At the end of each year, potential control subjects were randomly selected from the Taixing Population Registry system using frequency matching (by sex and 5-year age group). Trained local interviewers collected questionnaire information (including general demographic data, occupation and family socioeconomic status, smoking and alcohol drinking, oral health, family history, and some other basic factors) and biological samples (such as blood and tumor tissue). Serum samples from cases and controls were used to measure Helicobacter pylori (H. pylori) IgG antibodies qualitatively by immunoblotting assay.

The basic characteristics and exposure factors of cases and controls were described and compared using Wilcoxon rank-sum tests and chi-squared tests. Unconditional logistic regression models were used to derive odds ratios (ORs) and 95% confidence intervals (CIs) for the associations between risk factors and GC. When assessing the association between oral health and the risk of GC, we performed an interaction analysis (multiplicative interaction model) between known risk factors (age, sex, tobacco smoking, alcohol drinking, and H. pylori infection) and the indicators of oral health to evaluate effect modification. Furthermore, to assess the potential influence of non-differential misclassification, we performed a sensitivity analysis excluding the cases recruited from the local Cancer Registry. All tests were two-tailed, and P<0.05 was considered statistically significant. All analyses were performed using SAS 9.4.

2. Somatic mutation landscapes and mutational signatures of GC

All samples came from our population-based case-control study. Paraffin sections of tumor tissue and paired blood DNA from 100 GC cases were selected for WES. The WES experimental process consisted of three steps: sample DNA extraction, library construction, and sequencing. After sequencing, the raw data were analyzed against the UCSC human reference genome (hg38) to identify somatic mutations. The basic principles were: (1) The sequencing results of the tumor tissue and the corresponding peripheral blood were each compared with the human reference genome. At this stage, the mutations identified in tumor tissue included both germline and somatic mutations, while those identified in peripheral blood included only germline mutations. (2) By comparing the two sets of sequencing results, germline mutations were filtered out and somatic mutations were identified.

The bioinformatics analysis proceeded as follows: (1) FastQC was used for quality control and filtering of the two groups of raw data. (2) The filtered data (clean data) were aligned to the human reference genome (UCSC, hg38) with BWA (Burrows-Wheeler Alignment Tool), and then sorted and filtered. (3) Somatic mutations in the GC samples were detected and filtered with Mutect2 in GATK 4.1, yielding the somatic SNPs and InDels. During somatic mutation screening, LearnReadOrientationModel was used to correct for the base-substitution bias that paraffin-embedded samples may cause. (4) ANNOVAR was used to annotate the identified somatic SNP and InDel sites and determine the gene information corresponding to each mutation. After filtering the annotated somatic mutations, MutSigCV 1.41 was run in MATLAB R2019b to identify significantly mutated genes. The ggplot2, maftools, ggfortify, Biostrings, NMF, and sigminer packages in R (version 3.6.2) were used to visualize the results and identify mutational signatures.

3. Associations between GC risk factors exposure and somatic mutations of GC

We collected detailed information on GC risk factor exposure and clinical features from our previous study, and obtained the GC somatic mutation information identified by WES. In this part of the study, we evaluated the associations between GC risk factor exposure (such as age, gender, smoking, drinking, H. pylori infection, and oral health), clinical information (Lauren classification and TNM staging), and the different GC somatic mutation "subtypes" (base mutation types, mutated genes, and mutational signatures). All analyses were performed using R (version 3.6.3). We used the mean, standard deviation, and quartiles to describe the proportions of the 6 types of somatic base mutations that affected protein function under different GC risk factor exposures, and performed normality tests. Pearson correlation analysis was used to evaluate the association between age and somatic base mutations. For the associations between other GC risk factor exposures and somatic base mutations, the Wilcoxon rank-sum test was used for two-group comparisons, the Kruskal-Wallis rank-sum test for comparisons among multiple groups, and the Bonferroni method for pairwise comparisons (corrected P-value). The Fisher exact test was used to analyze the associations between risk factor exposure and high-frequency somatic mutation genes and significantly mutated genes. An unconditional logistic regression model was used to evaluate the associations between GC risk factor exposure and mutational signatures. The forestplot package in R was used to visualize the results.

Results

1. The associations between risk factors exposure and risk of GC

After excluding records with incomplete questionnaire information on risk factor exposure, 901 cases and 1972 controls were included in the final analysis. Family history of GC, H. pylori infection, age, education level, wealth score, alcohol drinking, and body mass index (BMI, 10 years before interview) were associated with the risk of GC. Tooth loss was not significantly associated with an increased risk of GC (yes vs. no, OR=1.08, 95%CI=0.88-1.33); however, there was a significant positive correlation between the number of filled teeth and the risk of GC. Compared with tooth brushing at least twice per day, tooth brushing once per day or less was associated with an increased risk of GC (OR=2.3, 95%CI=1.94-2.94). There was no significant interaction between the indicators of oral health and age, sex, tobacco smoking, alcohol drinking, or H. pylori infection.

2. Somatic mutation landscapes and mutational signatures of GC

We performed WES in 100 tumor-normal pairs from GC cases and identified 73,518 somatic mutations. SNPs were the most common class of somatic mutation (62,354, accounting for 84.8%), missense mutations were the main mutation type (57,223, accounting for 77.8%), and C>T was the most common single-base mutation (22,640, accounting for 36.3%). The MutSigCV analysis identified 2 significantly mutated genes related to GC, namely TP53 (mutation frequency 56%) and COL4A3 (mutation frequency 10%); COL4A3 was a newly identified potential significantly mutated gene in GC. Meanwhile, 23 high-frequency mutation genes and 86 recurrent somatic mutations were identified. On this basis, we drew a preliminary somatic mutation landscape of GC. In addition, we identified 3 GC mutational signatures, corresponding to Signature 1 (cosine coefficient=0.892), Signature 3 (cosine coefficient=0.744), and Signature 5 (cosine coefficient=0.865) in COSMIC.

3. The associations between GC risk factors exposure and somatic mutations

Using the GC somatic mutation information described above, GC was divided into different "subtypes", and we analyzed the associations between GC risk factor exposure and these "subtypes". Compared with patients ≤60 years old, those aged >60 years had a higher proportion of C>T mutations (P=0.025) and a slightly lower proportion of T>A mutations (P=0.045). The proportion of T>A mutations was slightly lower in smokers and in those with heavier tobacco smoking (P=0.035), and the proportion of C>G mutations was slightly lower in tobacco-exposed patients (P=0.043). Compared with diffuse GC patients, intestinal GC patients had a slightly lower proportion of T>C mutations (P=0.047). Compared with stage I patients, stage II/III/IV patients had a slightly lower proportion of C>A mutations (P=0.037) and significantly higher proportions of T>C (P=0.035) and T>G (P=0.043) mutations.

We also explored the associations between risk factor exposure and high-frequency and significantly mutated genes. The proportions of TP53 (P=0.048) and DNHD1 (P=0.043) gene mutations were significantly increased in persons >60 years old; the proportion of OBSCN gene mutations was significantly decreased in persons with tobacco exposure (P=0.026); and TNM staging was significantly related to the proportions of FSIP2 (P=0.041), HMCN2 (P<0.001), and DNAH2 (P=0.021) gene mutations. In addition, we carried out an in-depth association analysis between risk factor exposure and Signature 5. After adjusting for other risk factors, tobacco exposure was significantly associated with Signature 5: the odds of Signature 5 were about 5 times as high in patients exposed to tobacco (OR=5.09; 95%CI=1.06-24.48).

Conclusion

1. In addition to the known risk factors [family history of GC, H. pylori infection, age, education level, wealth score, alcohol drinking, and BMI (10 years before interview)], positive associations of poor oral hygiene habits and some indicators of poor oral health status with the risk of GC were confirmed.
2. Based on the WES results, we drew a preliminary somatic mutation landscape of GC and identified 2 significantly mutated genes related to GC (TP53 and COL4A3); COL4A3 was a newly identified potential significantly mutated gene in GC. We also identified 3 GC mutational signatures, corresponding to Signatures 1, 3, and 5 in COSMIC.
3. Significant associations were observed between risk factor exposure and both GC somatic base mutations and high-frequency mutation genes. Tobacco exposure was significantly associated with the occurrence of Signature 5: the odds of Signature 5 were about 5 times as high in patients exposed to tobacco.

Innovations

1. Based on a rigorously established population-based case-control study, we evaluated the relationships between a variety of risk factor exposures and the risk of GC, and comprehensively assessed, at multiple levels, the association between a newly identified potential risk factor (oral health) and the risk of GC.
2. Using WES, we comprehensively evaluated GC somatic mutation types, drew somatic mutation landscapes, and identified mutational signatures at the tumor genomics level. We further identified the "subtypes" of GC somatic mutations, which provides a theoretical basis for developing targeted therapy for GC.
3. We comprehensively evaluated the potential relationship between the "subtypes" of GC somatic mutations and GC risk factor exposure, and explored whether exposure to risk factors ultimately leads to GC through specific somatic mutations. The results provide a new research idea and method that can serve as a reference for etiology research on GC.
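
The cosine coefficients reported in the Results for Signatures 1, 3, and 5 measure how closely each extracted signature profile matches a COSMIC reference profile. The abstract does not describe the computation beyond naming the R packages used, so the following Python sketch only illustrates the general idea. The function names, the reference names `ref_A` and `ref_B`, and all numeric profiles are hypothetical, and the profiles use 6 toy channels (one per base substitution class) rather than the 96 trinucleotide contexts of COSMIC single-base-substitution signatures.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of channels")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("profiles must not be all zeros")
    return dot / (norm_a * norm_b)


def best_match(extracted, references):
    """Return (name, score) of the reference profile most similar to `extracted`."""
    scores = {name: cosine_similarity(extracted, profile)
              for name, profile in references.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Toy data (hypothetical, NOT real COSMIC values).
# Channel order: C>A, C>G, C>T, T>A, T>C, T>G.
extracted_signature = [0.10, 0.05, 0.50, 0.05, 0.20, 0.10]
reference_profiles = {
    "ref_A": [0.08, 0.04, 0.55, 0.05, 0.18, 0.10],
    "ref_B": [0.40, 0.10, 0.10, 0.20, 0.10, 0.10],
}

name, score = best_match(extracted_signature, reference_profiles)
print(f"best match: {name} (cosine similarity = {score:.3f})")
```

A coefficient near 1 means the two profiles distribute mutations across channels in nearly the same proportions. Because the measure ignores overall magnitude, it compares the shape of a signature rather than the mutation count.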
Keywords/Search Tags:Gastric cancer, Risk factor, Somatic mutation, Mutational signature, Etiology study
Related items
Genetic Determinants Of The Somatic Mutational Processes In Cancers Reveal Potential Driver Genes Of Cancer Evolution
Construction Of Landscape Of Somatic Mutation, And Functional Mining And Verification Of Representative Point Mutations In Gastric Cancer
Differential Methylated Probes At Pan-cancer Level And Association Analysis With Gene Expression,Mutational Signatures And Immune Signatures
Establishment Of The Prognostic Risk Signature Based On GEO And TCGA Database Combine With The Relationship Between CTNNAL1 Gene Expression And Clinicopathological Factors/Prognosis In Gastric Cancer
Tumor Mutational Burden Predicts The Efficacy Of Immune Checkpoint Inhibitors And Screening Of Related Mutation Genes In Gastric Cancer
The Signature Of Recurrence Risk For Stage ?-? Gastric Cancer Patients After Surgical Resection And Signature For Patients Treated With 5-Fluorouracil-Based Chemotherapy
Interaction Analysis Between Germline Susceptibility Loci And Somatic Alterations On Gastric Cancer
Identification of Non-Random Somatic Mutation Clustering While Accounting for Protein Tertiary Structure: Extensions, Novel Methodologies and Applications to Identifying Oncogenic Driver Mutations
A Study On LncRNA Signature As Biomarker Of Gastric Cancer And The Mechanism Of PHF10 Involving Gastric Carcinogenesis Through Inhibition Of Gastric Epithelial Cell Differentiation
The Role Of APOBEC Mutation Signature In Occurrence And Development Of Colorectal Cancer And Molecular Mechanism