
Chromatin Accessibility Profiles In Non-small Cell Lung Cancer

Posted on: 2021-11-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: K Luo
GTID: 1524306551491724
Subject: Genetics
Abstract/Summary:
Background and purpose
Lung cancer is the most common cause of cancer death worldwide, causing approximately 1.6 million deaths annually, with a low five-year survival rate of approximately 15.9%. About 85% of lung cancer cases are classified as non-small cell lung cancer (NSCLC), with lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) being the two most common subtypes. Numerous studies have investigated the pathogenesis of LUAD and LUSC from various omics perspectives, including the genomic, transcriptomic, epigenomic and proteomic levels. Mutations in TP53 and KRAS are common to different histological subtypes of NSCLC. However, most mutations differ between histological subtypes: U2AF1, RBM10 and SF3B1 are mainly mutated in lung adenocarcinoma, whereas alterations related to squamous cell differentiation, including overexpression and amplification of SOX2 and TP63, deletion mutations of NOTCH1, NOTCH2 and ASCL4, and focal deletion of FOXP1, occur mainly in lung squamous carcinoma. Recent studies have reported advances in epigenetic treatment of lung cancer. The Molecular Oncology Research Group at the Maxell Bloch Center for Molecular Medicine found that EZH2, the gene encoding a histone-lysine N-methyltransferase, inhibits tumor gene expression by participating in histone methylation. Combined application of EZH2 inhibitors and anti-inflammatory drugs significantly inhibited lung cancer progression. Low-dose adjuvant epigenetic therapy (AET) causes myeloid-derived suppressor cells involved in tumor metastasis to downregulate the expression of CCR2 and CXCR2, thereby inhibiting lung cancer metastasis. Phosphoproteomic studies have shown that activation of the MAPK signaling pathway occurs in the majority of NSCLC cases without KRAS mutations. A subset of lung cancer patients has been found clinically to carry no apparent driver gene mutations, and TCGA’s multi-omic study of LUAD noted that only a small
percentage of cases had known mutations that explained altered protein activity in the MAPK and PI(3)K pathways, suggesting an additional, unexplained mechanism of pathway activation. Although tumors are initiated by mutations in driver genes, malignancy does not arise whenever such mutations occur, suggesting that disease progression may also be regulated at other levels. Because open chromatin regions are an important layer of epigenetic regulation, we mapped genome-wide chromatin accessibility in the two major histological subtypes of NSCLC.

Methodology
Tumour tissues from 50 patients with non-small cell lung cancer were collected after surgical removal. After digestion and homogenization of the tumour tissues, ATAC-seq, transcriptome and whole-genome sequencing libraries were constructed. Libraries that passed quality control were sequenced on the Illumina platform, generating paired-end reads of 75 bp, 150 bp and 150 bp, respectively. After basic quality control of the sequencing data, reads were mapped to the reference genome; chromatin accessibility regions were then called from the ATAC-seq data, gene expression values were obtained from the transcriptome data, and genetic variant loci were called from the whole-genome data. Further sample-level quality control of the ATAC-seq data screened out samples of substandard quality. Overlap analysis with ENCODE lung cancer cell line DNase-seq data and TCGA NSCLC ATAC-seq data was used to assess the reliability of our ATAC-seq data and to reveal the newly discovered chromatin accessibility regions in our data. Signal intensities of the open chromatin regions were used for subtyping and clinical trait correlation analysis of our NSCLC samples. We identified chromatin accessibility regions that differ in intensity between LUAD and LUSC and analysed the genomic feature distribution and functional characteristics of these differentially accessible regions. We then identified transcription
factors with differential activity in LUAD and LUSC. Weighted gene co-expression networks were constructed from the gene expression data to identify gene modules significantly associated with the histological subtypes of NSCLC, and functional analysis was performed in conjunction with the differential transcription factors. After removing confounding factors such as sex, age and smoking, we performed a co-accessibility analysis of the chromatin accessibility regions, searching for co-accessible open regions within a 1 Mb genomic range, to explore the effects of co-accessible open regions on gene expression and to investigate their regulatory relationship to tumour-related genes. We also analysed higher-order chromatin structure to find co-accessible open regions located in the same topological domain and to explore their gene regulatory significance. We integrated three layers of omics data (SNPs from whole-genome sequencing, open chromatin regions from ATAC-seq, and gene expression), performed ATAC-QTL and eQTL analyses separately, and then analysed the overlap between ATAC-QTLs and eQTLs to construct an NSCLC transcriptional regulatory network in which genomic variants alter gene expression by altering chromatin accessibility. Trait-associated loci from genome-wide association studies of lung cancer were analysed using the ATAC-QTLs and eQTLs. Point mutations were introduced into the lung cancer cell line H1299 using CRISPR/Cas9 gene editing to validate the ATAC-QTL and eQTL results.

Results
The overlap rate between our NSCLC peaks and the lung cancer-associated cell line DNase-seq and TCGA NSCLC ATAC-seq peaks was about 60%. On the one hand, this shows that most of the peaks detected in our NSCLC samples are also present in the cell line and TCGA NSCLC ATAC-seq data, indicating that our ATAC-seq data are highly reliable. On the other hand, it also indicates that a large number of the chromatin accessibility regions
identified in our NSCLC population are novel. Chromatin accessibility regions in promoter regions tend to be conserved across the cohort, whereas those in distal regions are more sample-specific. Similarly, peaks in the promoter regions of ncRNAs were found to be conserved across NSCLC samples, which may indicate that some ncRNAs play important tumor-regulatory roles in NSCLC. LUAD and LUSC differ significantly in their open chromatin characteristics, although samples of an intermediate type also exist. Compared with LUAD, LUSC has a larger number of chromatin accessibility regions with higher signal, a trend supported by both our NSCLC population and the TCGA NSCLC population, which may indicate that chromatin regulation is more active in LUSC. LUAD-hyper peaks were mainly enriched around genes such as AHR, MBIP and SLC34A2; SLC34A2 has been found to inhibit tumor growth and metastasis in the lung cancer cell line A549 and in lung cancer metastasis models. LUSC-hyper peaks were enriched around TP63 and SOX2, genes specific to the lung squamous carcinoma subtype. Lung adenocarcinoma and lung squamous carcinoma are regulated by different core transcription factors. The transcription factors that are highly specific and active in lung adenocarcinoma are mainly NKX2-1, NKX2-2, NKX2-3, NKX2-5 and NKX2-8, together with HNF1A and HNF1B, whereas those that are highly specific and active in lung squamous carcinoma are mainly TP53, TP63 and TP73 and the SOX transcription factor family. Core transcription factors drive the formation of gene regulatory networks with histological subtype specificity. In MEturquoise, a gene module significantly associated with subtype, NKX2-1 and JDP2 act as core (hub) genes and master regulators and may regulate the expression of related oncogenes. In lung adenocarcinoma samples, we found that the promoter region of the module gene CTA-384D8.34 contains binding sites for both NKX2-1 and JDP2. In the lung squamous carcinoma
samples, the promoter region of the module gene GSE1 contains binding sites for both NKX2-1 and JDP2. This suggests that, in different histological subtypes of non-small cell lung cancer, NKX2-1 and JDP2 may contribute to the formation of subtype-specific gene expression networks by targeting different genes. We found that "co-accessible" chromatin regions are prevalent in both lung adenocarcinoma and lung squamous carcinoma and that they are associated with the co-expression of genes. "Co-accessible" regions are enriched in the regulatory regions of tumor-associated genes. A total of 180 pairs of "co-accessible" regions were distributed across 108 regulatory regions of genes catalogued in the Cancer Gene Census, including EGFR, TP53, MYC and ERBB2. This suggests that "co-accessible" regions are closely related to oncogene regulation. We also found that "co-accessible" regions are highly enriched in the topologically associating domains (TADs) of the lung cell lines A549 and IMR90. We found that genetic variation can regulate gene expression by altering chromatin accessibility. SNP rs10857795, located in a GSTM1 intron and annotated as non-functional in previous studies, correlated with both the expression level of GSTM1 and the ATAC-seq peak signal in the promoter region of this gene, suggesting that rs10857795 may increase the risk of developing NSCLC by regulating GSTM1 expression. Moreover, we validated the effect of the joint-QTL rs10857795 on GSTM1 expression in the NSCLC cell line H1299 using CRISPR/Cas9 gene editing. In addition, we identified two joint-QTLs (rs17079286 and rs73766221) capable of regulating ROS1 expression. They lie in the LD blocks containing the GWAS loci associated with lung cancer (rs9387478) and lung adenocarcinoma (rs9387479), respectively, and correlate with both the signal intensity of an ATAC-seq peak upstream of ROS1 and the expression level of ROS1. This suggests that mutations
in the ROS1 gene regulatory region can also cause dysfunction in the NSCLC network by affecting its gene expression.

Conclusion
This study used ATAC-seq to characterize chromatin accessibility in a Chinese NSCLC population. Lung adenocarcinoma and lung squamous cell carcinoma exhibited distinct open chromatin patterns, and specific open chromatin regions were found to be associated with specific driver mutation genes. Open chromatin analysis identified specific driver transcription factors: NKX2-1, NKX2-8 and HNF1B in LUAD, and TP63 and SOX2 in LUSC. Integrating ATAC-seq with gene expression data revealed that subtype-specific transcription factors are closely associated with the formation of subtype-specific transcriptional regulatory networks. In addition, we identified 21 joint quantitative trait loci (joint-QTLs) that correlated with both ATAC-seq (assay for transposase-accessible chromatin using sequencing) peak intensity and gene expression levels. Finally, we identified 87 regulatory risk loci associated with lung cancer-related phenotypes by intersecting the QTLs with significant loci from genome-wide association studies. In summary, this compendium of multi-omics data provides valuable insights and a resource for understanding the landscape of open chromatin features and regulatory networks in NSCLC.
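
The peak overlap rate reported in the Results (about 60% against the reference DNase-seq and ATAC-seq peak sets) can be illustrated with a minimal Python sketch. It is not the pipeline used in the dissertation, which the abstract does not describe. The function names, the (chrom, start, end) half-open peak representation and the "at least 1 bp" overlap criterion are all assumptions made here for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def merge_by_chrom(peaks):
    """Group (chrom, start, end) peaks by chromosome and merge overlapping or
    touching intervals, returning {chrom: (starts, ends)} with sorted,
    disjoint intervals. Coordinates are assumed half-open: [start, end)."""
    grouped = defaultdict(list)
    for chrom, start, end in peaks:
        grouped[chrom].append((start, end))
    merged = {}
    for chrom, intervals in grouped.items():
        intervals.sort()
        starts, ends = [], []
        for start, end in intervals:
            if ends and start <= ends[-1]:
                # Extends the previous merged interval.
                ends[-1] = max(ends[-1], end)
            else:
                starts.append(start)
                ends.append(end)
        merged[chrom] = (starts, ends)
    return merged


def overlap_fraction(query_peaks, reference_peaks):
    """Fraction of query peaks sharing at least 1 bp with any reference peak
    (assumed criterion; real analyses often require a minimum overlap)."""
    if not query_peaks:
        return 0.0
    merged = merge_by_chrom(reference_peaks)
    hits = 0
    for chrom, start, end in query_peaks:
        starts, ends = merged.get(chrom, ([], []))
        # First merged interval whose end lies beyond the query start.
        i = bisect_right(ends, start)
        if i < len(starts) and starts[i] < end:
            hits += 1
    return hits / len(query_peaks)
```

Calling overlap_fraction(our_peaks, reference_peaks) with two lists of peak tuples returns a value between 0 and 1; the complement is the share of peaks absent from the reference set, which corresponds to the candidate novel regions discussed in the Results.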
Keywords/Search Tags:NSCLC, chromatin accessibility, gene regulatory network, ATAC-QTL