Construction Of The Epigenome Map Of Normal Prostate Based On Multi-omics Sequencing Data And Exploration Of Epigenomic Driving Mechanisms In Neuroendocrine Prostate Cancer
Posted on: 2023-11-11 | Degree: Doctor | Type: Dissertation | Country: China | Candidate: T Wang | Full Text: PDF | GTID: 1524306905995199 | Subject: Surgery (Urology)
Abstract/Summary:

Background
Epigenetics refers to heritable changes in gene expression patterns and function that occur without changes in the DNA sequence and that ultimately lead to phenotypic changes [1]. It spans several research directions, including DNA methylation, histone modification, ATP-dependent chromatin remodeling, nucleosome dynamics and non-coding RNA [2]. The close coordination and fine regulation of epigenetic modifications at different levels and spatial scales establish and maintain the spatiotemporally specific expression of cell-lineage genes, and play a key role in organ and tissue development, cell fate determination, tumor differentiation, cell senescence and stem cell self-renewal. With the spread of next-generation sequencing, epigenomics, the systematic study of epigenetic modification maps at the whole-genome level, has flourished. For example, the NIH Roadmap project [3] integrates multi-omics epigenetic sequencing data (chromatin accessibility sequencing, DNA methylation sequencing, transcriptome sequencing and immunoprecipitation sequencing of multiple histone modifications) to partition chromatin into functional states and explore their roles. Because the epigenome is tissue-specific, the Roadmap project yielded 111 normal reference epigenomes for different tissues, providing rich datasets and reference standards for studies of tissue development, stem cell plasticity, epigenetic drivers of disease and tumor transdifferentiation. The prostate, as an important male reproductive organ, is not
included in the plan, and a comprehensive epigenome of normal prostate tissue is still lacking. Constructing the epigenome of normal prostate tissue is therefore of great significance for exploring prostate development, the maintenance of functional phenotypes, and the abnormal molecular driving mechanisms of prostate-related diseases, especially prostate cancer.
Prostate cancer is one of the most common tumors of the male reproductive system. Worldwide, its incidence is second only to that of lung cancer, ranking second among male malignant tumors [4]. Advanced prostate cancer is the leading cause of death from prostate cancer. Neuroendocrine prostate cancer (NEPC) is a special pathological type of prostate cancer and one of the main causes of castration-resistant prostate cancer (CRPC). Its prognosis is very poor: most patients die within 1-2 years of diagnosis, and the 5-year survival rate is less than 1% [5]. NEPC usually responds poorly to androgen deprivation therapy (ADT), and there is currently no effective treatment; platinum drugs combined with etoposide are the mainstay of clinical treatment [6]. It is therefore of great significance to study the molecular diagnosis, driving mechanisms and potential therapeutic targets of NEPC.
In recent years, a growing number of studies have confirmed that epigenetic abnormalities have an important impact on NEPC differentiation and lineage remodeling [7,8]. NEPC and prostate adenocarcinoma (ADPC) can share the same genomic alterations, yet their phenotypes are very different, suggesting that lineage remodeling, which is mainly regulated by epigenetics, may be a key factor in the progression of NEPC [9]. Multiple studies have confirmed hypermethylation of the tumor suppressor gene SPDEF during NEPC progression, which is related to the expression of neuroendocrine (NE) markers [10]. In addition, EZH2, a histone methyltransferase, is often highly expressed in NEPC. Studies have shown that abnormal methylation
mediated by EZH2 is involved in regulating tumor epigenetic reprogramming and promoting the occurrence of NEPC [11,12]. However, because the molecular mechanism of NEPC is extremely complex, its exact pathogenesis has not been clarified, and how abnormal DNA methylation affects the epigenetic reprogramming of NEPC remains unknown. Moreover, previous studies of DNA methylation in NEPC have mostly been limited to selected genes or CpG sites, with low resolution and limited coverage, so genome-wide DNA methylation maps at single-base resolution have not been available. Epigenetic regulatory factors such as FOXA2, NEUROD1 and BRN2 have also been linked to the occurrence of NEPC in recent years [13,14], but their specific regulatory mechanisms remain unclear. The epigenetic driving mechanism of NEPC therefore still requires extensive further study.
In view of the above, we first used next-generation high-throughput sequencing to obtain multi-omics epigenomic datasets of normal prostate tissue, so as to construct the epigenomic map of normal prostate tissue and fill this gap. Then, based on the constructed epigenomic map, we performed whole-genome methylation sequencing on NEPC and ADPC and constructed their DNA methylation maps at single-base resolution, in order to study the global DNA methylation abnormalities of NEPC and identify possible epigenetic driver molecules and regulatory mechanisms. Finally, given the high intratumoral heterogeneity of prostate cancer, we performed single-cell transcriptome sequencing. This makes it possible both to analyze the transcriptional characteristics of different NEPC subpopulations and to combine the transcriptional profile of NEPC with its epigenetic regulatory mechanisms, providing new insights and directions for understanding lineage remodeling and the epigenetic molecular mechanisms of neuroendocrine prostate cancer.

Section Ⅰ: Acquisition and quality control of multi-omics epigenomic datasets
of normal prostate tissue
Objectives
1. Optimize the chromatin immunoprecipitation conditions for prostate tissue.
2. Obtain high-quality multi-omics epigenome datasets, including chromatin immunoprecipitation sequencing (ChIP-seq) of six common histone modifications (H3K4me1, H3K4me3, H3K9me3, H3K27me3, H3K27ac and H3K36me3), chromatin accessibility sequencing (ATAC-seq), whole-genome bisulfite sequencing (WGBS) and transcriptome sequencing (RNA-seq).
Methods
1. A Bioruptor Plus sonicator was used to shear chromatin, and agarose gel electrophoresis was used to check fragment length.
2. A gradient of IP conditions was tested to explore the effects of formaldehyde concentration, fixation time and sonication power on chromatin shearing.
3. ChIP results were verified by ChIP-qPCR to confirm enrichment in the target regions.
4. FastQC and MultiQC were used to assess sequencing quality, the number of raw reads, the percentages of Q20 and Q30 bases and the GC content of the raw data.
5. Trimmomatic was used to filter the raw data and remove low-quality bases, after which quality control was repeated.
6. Sequencing reads were mapped to the hg19 genome with Bismark for WGBS, STAR for RNA-seq and BWA-MEM for ChIP-seq and ATAC-seq.
7. Samtools flagstat was used to summarize alignment quality metrics, and Picard was used to count and mark or remove duplicate reads.
8. SPP in phantompeakqualtools was used to compute strand cross-correlation and the NSC and RSC values of the ChIP-seq data.
9. ChIPQC was used to calculate FRiP, SSD and related metrics. ATACseqQC was used to examine the insert size distribution and the TSS enrichment score. bismark2report was used to generate the WGBS quality control report and to check for methylation bias (M-bias).
10. MethylExtract was used to calculate the bisulfite conversion rate and to check whether the conversion of the methylation library was
complete.
11. deepTools was used to correct the read positions of the ATAC-seq data.
12. Genome-wide signal peaks were called for all ChIP-seq and ATAC-seq data using MACS2.
Results
1. After gradient optimization, the optimal ChIP conditions for prostate tissue were determined: formaldehyde fixation for 10 minutes at a final concentration of 1.5%, and sonication for 30 cycles of 10 s on and 10 s off. These conditions gave good results.
2. ChIP-qPCR showed specific enrichment in the target regions, with % of input ranging from 1.1% to 19.3% and mean fold enrichment ranging from 9.2 to 87.6.
3. The percentages of Q20 and Q30 bases were above 95% in all sequencing data.
4. Alignment results were satisfactory for all sequencing data, with mapping rates of 85.4%-96.2%, unique mapping rates of 79.4%-93.0% and duplicate read rates of 8.92%-24.26%. The number of usable reads in the paired-end ATAC-seq and ChIP-seq data was greater than 40 M, exceeding ENCODE's quality control standard. The sequencing depth of the WGBS data was 30×, which met the requirements of subsequent analysis.
5. FRiP ranged from 22% to 63% across all ChIP-seq data, indicating that a large share of sequenced reads fell within signal peaks; NSC > 1.05 and RSC > 0.8, so the read distribution was consistent with expected biological characteristics.
6. ATAC-seq insert sizes reflected nucleosome periodicity, with a stepwise, progressively decreasing distribution, and there was a clear enrichment signal at transcription start sites (TSSs), with an enrichment score greater than 10.
7. The bisulfite conversion rate of the WGBS data was above 99.8%. Read 2 showed a significant methylation bias (M-bias), which was eliminated after correction.
8. Biological reproducibility was good for all sequencing data. Taking the H3K27ac ChIP-seq data as an example, it was highly correlated with public data.
Summary
In this part, multi-omics epigenetic datasets of normal prostate tissue were obtained, including ChIP sequencing, chromatin
accessibility sequencing, transcriptome sequencing and whole-genome methylation sequencing, with ChIP sequencing covering six histone modifications (H3K4me1, H3K4me3, H3K9me3, H3K36me3, H3K27me3 and H3K27ac). Strict quality control was applied to the sequencing libraries and the raw sequencing data. The resulting high-quality data provide the basis for the subsequent construction of the normal prostate epigenome map.

Section Ⅱ: Construction of an epigenomic map of normal prostate tissue
Objectives
To construct an epigenetic reference map of normal prostate tissue by integrating the multi-omics epigenome sequencing datasets collected in Section Ⅰ using the ChromHMM algorithm.
Methods
1. ChIP-seq and ATAC-seq signals were visualized using deepTools.
2. BEDTools was used to overlap CpG sites with the annotation of each functional region, and the average methylation level of each functional region, such as CpG islands, introns and intergenic regions, was then calculated. R and Python were used to visualize the data.
3. The hg19 genome was divided into non-overlapping 200 bp bins.
4. Using ChromHMM, which is based on a hidden Markov model, the prostate genome was divided into 18 chromatin states.
5. The NeighborhoodEnrichment command of ChromHMM was used to calculate the enrichment of each chromatin state in each functional region, and the methylation level and chromatin accessibility of each chromatin state were calculated.
6. The constructed prostate epigenome was visualized in the Epigenome Browser.
7. Using the SRA Toolkit, ChIP-seq data for H3K27ac, AR and FOXA1 in normal prostate were downloaded from GSE130408, GSE130408 and GSE70079, respectively, and analyzed with a standardized workflow.
8. Using a Gaussian mixture model, all genes in the prostate transcriptome data were divided into expressed and repressed genes, and differences in epigenetic signal between the two groups were examined with deepTools.
9. MethylSeekR was used to identify unmethylated regions (UMRs) and low-methylated regions (LMRs), with cutoffs of at least 5
CpGs, FDR < 0.05 and methylation level < 0.5.
10. Hierarchical clustering was used to explore the epigenomic correlation between prostate and other tissues, and multidimensional scaling was performed. K-means clustering was used to help explain the tissue specificity of prostate enhancers.
11. HOMER was used to identify and analyze transcription factors that may be related to prostate tissue development.
Results
1. The genome-wide mean methylation level of normal prostate tissue was (80.1±1.1)%. Hypomethylated CpG sites (methylation level < 0.25) accounted for (5.4±0.7)%, CpG sites with intermediate methylation (0.25-0.75) accounted for (20.1±2.4)%, and the remaining CpG sites were hypermethylated (> 0.75).
2. Across the prostate genome, 13,565 unmethylated regions (UMRs) and 65,800 low-methylated regions (LMRs) were identified. ATAC-seq peaks, which mark accessible chromatin, were divided into promoter-proximal peaks (n=13,553) and promoter-distal peaks (n=27,840).
3. The prostate epigenome was successfully constructed, with the chromatin of prostate tissue divided into 18 states. Active-promoter-related states (TssA, TssFlnk, TssFlnkU and TssFlnkD) covered 1.86% of the prostate genome (70,481 regions in total) and were significantly enriched at TSSs. Enhancer-related states (EnhG1, EnhG2, EnhA1 and EnhA2; 105,593 regions in total) covered 2.92% of the genome and were enriched upstream and downstream of TSSs, consistent with the known distribution of enhancers. Together, promoters and enhancers accounted for 4.78% of the genome.
4. The prostate epigenome and related data are openly shared on the Epigenome Browser at https://epigenomegateway.wustl.edu/browser/?genome=hg19&hub=https://epigenome.wustl.edu/normal_prostate_epigenome/hub
5. The prostate epigenome shows clear tissue specificity: only 18.9% of active enhancers are shared among different tissues, and 7,580 prostate-specific enhancers were identified, which are closely related to the expression of 103 prostate-specific genes.
6. SNP sites
associated with prostate cancer were enriched only in prostate tissue enhancers (P<0.05), with no significant enrichment in enhancers of other tissues. Among them, the C/C→C/G change at SNP rs17694493 is likely to disrupt an AR DNA-binding site, increase the transcription of CDKN2B-AS1 and promote the occurrence of prostate cancer.
Summary
The ChromHMM algorithm was used to construct an epigenetic reference map of normal prostate tissue by integrating multiple epigenetic datasets. The whole genome of normal prostate tissue was divided into 18 chromatin states, and differences in chromatin accessibility, DNA methylation and transcription levels among states were examined. We also studied the tissue specificity of the prostate epigenome and identified 7,580 tissue-specific enhancers, which are closely related to prostate tissue development.

Section Ⅲ: The changes and significance of genome-wide methylation in neuroendocrine prostate cancer based on the epigenome map
Objectives
Using the normal prostate reference epigenome map obtained in Section Ⅱ, to characterize abnormal DNA methylation in NEPC and to examine, from the global to the local level, how changes in DNA methylation influence the development of NEPC.
Methods
1. WGBS libraries were constructed and sequenced for the collected NEPC and ADPC samples.
2. FastQC and MultiQC were used to assess sequencing quality, the number of raw reads, the percentages of Q20 and Q30 bases and the GC content of the raw data.
3. Trimmomatic was used to filter the raw data and remove low-quality bases, after which quality control was repeated.
4. Bismark was used to map sequencing reads to the hg19 genome. bismark2report was used to generate the WGBS quality control report and to check for methylation bias. Samtools flagstat was used to summarize alignment quality metrics, and Picard was used to count and remove duplicate reads.
5. Principal component analysis (PCA) was performed on the methylation matrix of all samples to observe the differences between
groups.
6. ggplot2 was used to visualize the PCA results and one- and two-dimensional density plots of genome-wide methylation, and to examine global DNA methylation changes by chromosome.
7. MethylSeekR was used to identify partially methylated domains (PMDs) and to analyze changes in methylation level and the number and expression of genes within PMDs.
8. The DSS package was used to identify differentially methylated regions (DMRs) between ADPC and NEPC samples, and the enrichment and P-values of these DMRs in the different chromatin states of the epigenome were calculated with a hypergeometric test.
9. H3K27ac ChIP-seq data for NEPC and ADPC were downloaded from the GEO database, NEPC-specific enhancers were identified, and their relationship with DMRs was analyzed.
10. Transcriptome sequencing data for publicly available NEPC samples were downloaded from cBioPortal, and differences in gene expression were analyzed.
11. GREAT was used for GO enrichment analysis of the demethylated DMRs, HOMER was used to identify transcription factors that might bind them, the epigenetic regulation mechanism of NEPC was analyzed, and the epigenome browser was used for visualization.
Results
1. Compared with ADPC, NEPC showed a global loss of DNA methylation, and the difference was statistically significant.
2. Epigenetic mutation load was positively correlated with Gleason score (R=0.69, P<0.05): the higher the epigenetic mutation load, the higher the degree of malignancy.
3. PMDs covered 21.7% to 53.1% of the genome in NEPC, and methylation levels within PMDs were lower in NEPC than in ADPC (P<0.05). Gene expression within PMDs was significantly lower than outside PMDs, and CpG islands within PMDs were significantly hypermethylated compared with CpG islands outside PMDs. This large-scale loss of methylation in PMDs, accompanied by increased methylation of local CpG islands, may be one of the characteristics of prostate tumors.
4. The number of locally demethylated regions in NEPC was
significantly higher than that of hypermethylated regions, and demethylated regions accounted for more than 98% of all DMRs. They were mainly enriched in enhancer-related states, indicating that abnormal enhancer methylation may promote the occurrence of NEPC.
5. A total of 20,216 NEPC-specific enhancers and 7,095 ADPC-specific enhancers were identified, and demethylated regions were enriched in NEPC-specific enhancers.
6. Demethylated NEPC-specific enhancers may bind key transcription factors such as FOXA2, FOXA1, ASCL1, NKX2-2, SOX2 and NEUROD1 to drive the development of NEPC. The epigenomic state of the key transcription factor ASCL1 and of the NE phenotypic marker CHGA changed completely from repressed to active, and their upstream enhancers were demethylated and enriched for the H3K27ac activation signal and FOXA1 binding.
Summary
Genome-wide DNA methylation maps of NEPC and ADPC were delineated at single-base resolution, and the global demethylation characteristics of NEPC differed significantly from those of ADPC. In PMDs and transposon regions, NEPC showed a progressive loss of DNA methylation compared with ADPC. The differentially methylated regions were enriched in NEPC-specific enhancers and may bind key transcription factors such as FOXA2, FOXA1, ASCL1, NKX2-2, SOX2 and NEUROD1, suggesting that DNA methylation plays an important and distinctive role in NEPC and is involved in inducing the NEPC phenotype.

Section Ⅳ: Single-cell transcriptome sequencing to explore intra-tumoral heterogeneity and epigenetic regulatory mechanisms in neuroendocrine prostate cancer
Objectives
Single-cell transcriptome sequencing was performed on patients with NEPC to explore the heterogeneity of NEPC on the basis of transcriptome profiles, to search for evidence of the possible origin of NEPC, and to link the transcriptional lineage characteristics of NEPC to epigenetic regulation, so as to explore the epigenetic regulatory mechanisms of
neuroendocrine prostate cancer.
Methods
1. scRNA-seq on the 10x Genomics platform was performed on 4 patients with clinically highly suspected NEPC, and scRNA-seq data from 3 NEPC patients were downloaded from GEO.
2. Cell Ranger 4.0 was used to process the sequencing data and generate Seurat input files. Seurat was used for quality control: the UMI count per cell, the number of detected genes, the ratio of UMIs to detected genes and the percentage of mitochondrial genes were evaluated, and low-quality cells were filtered out.
3. UMAP was used to reduce the dimensionality of the normalized single-cell RNA expression matrix.
4. Using EPCAM-/VIM+ normal cells as a reference, inferCNV was used to analyze copy number variation in single cells, and malignant cell populations were identified by combining copy number variation with the EPCAM+/VIM- profile.
5. Marker genes specific to normal cell types were used to annotate the cell type of each subpopulation.
6. Non-negative matrix factorization (NMF) was used to explore gene expression modules in malignant tumor cells. GO enrichment analysis was performed on the resulting gene sets to search for NE cell subsets.
7. DESeq2 was used to analyze the expression profiles of different NEPC cell subsets, identify differentially expressed genes and analyze the intratumoral heterogeneity of NEPC.
8. Immunofluorescence staining was performed to validate the scRNA-seq results and to check whether the identified NEPC subgroups could be detected.
9. Based on the inferCNV copy number variation results, whether NEPC and ADPC derived from the same ancestral clone was analyzed, in search of evidence for the origin of NEPC.
10. Using pySCENIC, the transcriptional lineage characteristics of NEPC were combined with epigenetic regulatory mechanisms to explore transcription factors that may drive the different subsets.
Results
1. A total of 36,036 high-quality cells were included in the analysis, with an average of 2,490 genes and 11,300 transcripts detected
per cell, meeting the requirements of subsequent analysis.
2. After UMAP dimensionality reduction, 21, 19, 22, 24, 21, 16 and 11 cell subsets were obtained from samples NEPC1-7, respectively. A total of 19,605 malignant cells were identified based on inferCNV copy number variation and EPCAM+/VIM- markers. Between 83.76% and 98.57% of the single nucleotide variants (SNVs) and small insertions and deletions (indels) detected by paired WES were verified in these malignant cells, supporting the accuracy of malignant cell identification.
3. Clustering of the NMF results revealed two subgroups, NE1 and NE2, with different NE characteristics. NE1 mainly expressed transdifferentiation-related transcription factors such as ASCL1 and was enriched for GO terms related to fate differentiation, whereas NE2 mainly expressed NE-related phenotypic markers such as CHGA and was enriched for GO terms related to NE function. NE1 may represent an early stage of NEPC differentiation, and NE2 a relatively late stage.
4. The NE1 and NE2 subgroups were detected in NEPC samples by immunofluorescence staining, which validated the scRNA-seq analysis and clarified the intra-tumoral heterogeneity of NEPC.
5. Taking NEPC3 as an example, the CNVs shared among malignant cells, combined with the UMAP projection of single cells, showed that NEPC and ADPC tumor cells had the same genomic background, suggesting that NEPC and ADPC derived from the same progenitor clone.
6. pySCENIC identified potentially distinct transcription factor regulatory networks for NE1 and NE2, including NE1-specific NKX2-2, NE2-specific SOX2 and POU3F2, and FOXA2 and ASCL1 shared by both subpopulations. These results are consistent with the methylation results in Section Ⅲ. They help explain the epigenetic driving relationships among these transcription factors, enhancers and DNA methylation, and their complex synergy during NEPC transdifferentiation.
Summary
In this part of the
study, we constructed single-cell gene expression profiles of 7 patients with NEPC, identified two NE tumor subpopulations with different characteristics that represent different stages of NEPC transdifferentiation, and explored multiple subtype-related transcription factors that may promote NEPC differentiation and lineage remodeling. In addition, the inferCNV copy number variation results supported the view that NEPC originates from ADPC.
Keywords/Search Tags: Histone modification, Methylation, Chromatin accessibility, ChIP-seq, WGBS, ATAC-seq, RNA-Seq, Prostate, Epigenome, Enhancer, ChromHMM, GWAS, NEPC, Transcription factor, H3K27ac, Single-cell sequencing, Tumor heterogeneity, InferCNV, NMF
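
The Section Ⅱ results group CpG sites into hypomethylated (< 0.25), intermediate (0.25-0.75) and hypermethylated (> 0.75) classes. A minimal Python sketch of that grouping is given here for illustration only. The function name `classify_cpg_methylation` and the toy input values are hypothetical and are not taken from the dissertation, which derived per-CpG levels from WGBS data.

```python
from collections import Counter


def classify_cpg_methylation(levels):
    """Return the fraction of CpG sites in each methylation class.

    `levels` holds per-CpG methylation levels in [0, 1]. Thresholds follow
    the abstract: hypo < 0.25, intermediate 0.25-0.75, hyper > 0.75.
    """
    levels = list(levels)
    if not levels:
        raise ValueError("at least one CpG methylation level is required")

    counts = Counter()
    for level in levels:
        if not 0.0 <= level <= 1.0:
            raise ValueError(f"methylation level out of range: {level}")
        if level < 0.25:
            counts["hypo"] += 1
        elif level <= 0.75:
            counts["intermediate"] += 1
        else:
            counts["hyper"] += 1

    total = len(levels)
    return {name: counts[name] / total
            for name in ("hypo", "intermediate", "hyper")}


# Toy values only (an assumption for illustration); real input would be
# per-CpG methylation levels exported from a WGBS methylation caller.
toy_levels = [0.0, 0.1, 0.25, 0.5, 0.75, 0.8, 0.9, 1.0]
print(classify_cpg_methylation(toy_levels))
```

Both boundary values (0.25 and 0.75) are assigned to the intermediate class here, which matches the inequalities as written in the abstract; a different convention would only require changing the two comparisons.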
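
Section Ⅲ tests whether DMRs are enriched in particular chromatin states using a hypergeometric test. The sketch below shows one way such a test could be set up if the genome is counted in fixed-size bins. The function name `hypergeom_enrichment`, its parameters and the bin-based counting are assumptions made for illustration; the abstract does not state how the counts were defined.

```python
from math import comb


def hypergeom_enrichment(total_bins, state_bins, dmr_bins, overlap_bins):
    """Fold enrichment and upper-tail hypergeometric P-value.

    total_bins   : number of genomic bins considered (population size)
    state_bins   : bins annotated with the chromatin state of interest
    dmr_bins     : bins overlapping a DMR (the "draws")
    overlap_bins : bins that are both in the state and in a DMR
    Returns (fold_enrichment, P(X >= overlap_bins)).
    """
    if not 0 < state_bins <= total_bins:
        raise ValueError("state_bins must be in 1..total_bins")
    if not 0 < dmr_bins <= total_bins:
        raise ValueError("dmr_bins must be in 1..total_bins")
    if not 0 <= overlap_bins <= min(state_bins, dmr_bins):
        raise ValueError("overlap_bins is inconsistent with the other counts")

    # Overlap expected by chance if DMR bins were placed at random.
    expected = dmr_bins * state_bins / total_bins
    fold = overlap_bins / expected

    # Sum the hypergeometric probability mass from the observed overlap up
    # to the largest possible overlap. math.comb returns 0 when k > n, so
    # impossible configurations contribute nothing.
    denom = comb(total_bins, dmr_bins)
    tail = sum(
        comb(state_bins, k) * comb(total_bins - state_bins, dmr_bins - k)
        for k in range(overlap_bins, min(state_bins, dmr_bins) + 1)
    )
    return fold, tail / denom


# Toy counts only (hypothetical): 10 bins, 5 in the state, 5 DMR bins,
# all 5 DMR bins falling inside the state.
fold, p_value = hypergeom_enrichment(10, 5, 5, 5)
print(fold, p_value)
```

Exact integer arithmetic keeps the example dependency-free. For genome-scale counts, `scipy.stats.hypergeom.sf` computes the same upper tail far more efficiently.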