
Exploring DNA Methylation: Comparison of Software, Database Construction and Pan-Cancer Biomarker Discovery

Posted on: 2024-01-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: P P Guan
Full Text: PDF
GTID: 1520307160470734
Subject: Bioinformatics
Abstract/Summary:
DNA methylation is currently considered the most stable epigenetic modification. As research on DNA methylation has progressed, a variety of analysis tools have been developed and a large amount of DNA methylation data has accumulated. However, researchers lack guidance on tool selection when analyzing DNA methylation data, and the biological value of such data remains to be further explored. To address these issues, this study compares the performance of different alignment tools and explores the biological value of existing DNA methylation data by collecting and reanalyzing it. The main research contents of this dissertation are: comparing the performance of 12 alignment algorithms; constructing the first allele-specific DNA methylation (ASM) database; and mining existing data for pan-cancer biomarkers. The details are as follows:
(1) Because many alignment algorithms are available and their performance varies, researchers may struggle to select appropriate analysis tools. To address this issue, we compared the performance of 12 alignment algorithms for DNA methylation data analysis, with a particular focus on single-cell DNA methylation data. Of the 12 algorithms evaluated, Abismal, Bismark_bowtie2_e2e, Bismark_bowtie2_local, bwameth, BSMAP and Hisat-3n showed high unique mapping rates and accuracy. Taking into account running time, memory consumption, unique mapping rate and mapping accuracy, this study suggests that Abismal performs better on single-cell bisulfite sequencing data. Additionally, we observed that reads with a lower percentage of soft-clipped bases had higher mapping accuracy. We also noted significant differences in single-cell methylation levels when using different alignment algorithms. Our work provides valuable information to researchers selecting alignment algorithms, and can effectively improve the efficiency and accuracy of detecting DNA methylation levels.
(2) Until recently, the field of allele-specific DNA methylation databases was largely unexplored. In response to this gap, our study constructed the first comprehensive allele-specific DNA methylation database (ASMdb). To establish this database, we collected 4400 BS-Seq datasets and 1598 corresponding RNA-Seq datasets from 47 species, including human and mouse. We then analyzed the data to obtain DNA methylation levels, allele-specific DNA methylation and allele-specific expressed genes (ASEGs), and further characterized the distribution patterns of ASM/ASEGs in these species. In-depth ASM distribution analysis and differential methylation analysis in nine cancer types showed that the results obtained in this study were consistent with previously reported changes of ASM in key tumor genes. Moreover, we identified several potential ASM-related tumor genes. Finally, integrating these results, we constructed the first resource-rich and comprehensive ASM database (ASMdb) for 47 species, which is a valuable resource for researchers studying DNA methylation.
(3) Given the limited research on pan-cancer DNA methylation tumor markers, our study aimed to identify such markers using 858 whole-genome bisulfite sequencing datasets. After analyzing changes in methylation across different cancer types, we identified nine candidate tumor marker genes. Two of these genes were found to form a reliable marker combination using a binary regression model, which was verified on the TCGA dataset. Furthermore, we demonstrated that the combination of these two genes was effective in predicting cancer samples at different stages. Using whole-genome bisulfite sequencing data, we also identified three sites in different cancer types that could not be detected by the 450K chip. A binary logistic regression model built on these sites accurately recognized tumor samples (AUC=0.989). In conclusion, our study shows the great potential of DNA methylation as a tumor marker, and the methylation sites identified in our research provide new resources for pan-cancer tumor markers.
To sum up, our study compared 12 DNA methylation alignment algorithms, providing valuable reference information for researchers selecting tools. In addition, the construction of the allele-specific DNA methylation database fills a gap, as no comprehensive collation and analysis of ASM previously existed. Using the methylation data in the ASMdb database, we were able to identify pan-cancer tumor markers. Together, these efforts have provided important assistance to the study of DNA methylation and deepened our understanding of its role in cancer.
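As an illustration of the soft-clip observation in part (1), the share of soft-clipped bases in a read can be derived from the CIGAR string of its SAM/BAM alignment record. The sketch below is not taken from the dissertation; the function name `soft_clip_fraction` is hypothetical, and the definition used here (soft-clipped bases divided by read length) is an assumption about how such a percentage might be computed.

```python
import re

# CIGAR operations that consume bases of the read, per the SAM specification.
QUERY_CONSUMING_OPS = set("MIS=X")
CIGAR_TOKEN = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_fraction(cigar: str) -> float:
    """Return the fraction of read bases that are soft-clipped (CIGAR op 'S').

    Hypothetical helper for illustration; assumes the read is aligned.
    """
    tokens = CIGAR_TOKEN.findall(cigar)
    # Reject unaligned reads ('*') and malformed strings.
    if not tokens or "".join(n + op for n, op in tokens) != cigar:
        raise ValueError(f"unparseable CIGAR string: {cigar!r}")
    read_len = sum(int(n) for n, op in tokens if op in QUERY_CONSUMING_OPS)
    if read_len == 0:
        raise ValueError(f"CIGAR string consumes no read bases: {cigar!r}")
    clipped = sum(int(n) for n, op in tokens if op == "S")
    return clipped / read_len


if __name__ == "__main__":
    # 5 + 5 soft-clipped bases out of a 100-base read.
    print(soft_clip_fraction("5S90M5S"))
```

Reads could then be binned by this fraction and compared against known mapping positions to examine how accuracy changes with the amount of clipping.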
Keywords/Search Tags: DNA methylation, alignment algorithm, allele-specific DNA methylation, pan-cancer, tumor marker