
Development Of Bioinformatics Tools For Time-course Data Analysis And GSA Methods

Posted on: 2022-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: X W Huang
Full Text: PDF
GTID: 2480306542495674
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Objective: (1) Create a platform, built on Neo4j and Shiny, for studying how gene and gene-set networks change over time. Using a COPD gene expression data set as an example, we built a Shiny app that lets researchers quickly inspect how gene co-expression changes over time, with the aim of finding key genes and co-expression regulation patterns in the onset and progression of disease. (2) Build a platform that wraps different R gene set analysis (GSA) tools into the Galaxy environment, making them easier to use and their results reproducible.

Motivation: (1) Studying how a disease changes over time is complicated, and a systematic, comprehensive method is often lacking. Analyzing temporal changes in gene co-expression should help in understanding the complex mechanisms of gene regulation. (2) There are many GSA methods, which makes it hard for researchers to choose the right one for their data. In addition, some widely available GSA methods are usable only by people with programming experience, because they have no graphical user interface. To address this, we developed Galaxy tools based on the source code of the corresponding GSA packages, so that anyone can use them easily.

Method: (1) We used WGCNA to construct a gene co-expression network from the COPD time-series data set GSE108134 and associated the genes with KEGG pathways and GO terms. The co-expression network was then imported into a Neo4j database. Finally, we used the shiny package to create a visual Shiny app and used it to analyze how the COPD gene co-expression network changes over time. The app has three tabs: the first is Genes Relationships in KEGG Pathway/GO Term, the second is Genes Neighborhoods Relationships, and the third is Alluvial Diagram. (2) First, install Galaxy. Second, select tools that work well in Galaxy. Third, create a wrapper for each tool, consisting of one R script that runs the package and one XML file that defines the interface and links the R script to Galaxy. Fourth, create three types of distribution: individual tools in the Galaxy Tool Shed, a virtual machine, and a website.

Result: (1) Taking the COPD data set as an example, we created a Shiny app based on the WGCNA method that shows the gene co-expression of COPD under different conditions and links it to KEGG and GO. Dynamic maps of gene-KEGG, gene-GO and gene-gene relationships under different conditions can be drawn quickly. In the COPD data set, we found that IL1? was co-expressed with CCR1, ILRN, PLAUR, PROK2, CXCR2, SRGN, FPR1, S100A9 and other genes at different time points, was associated with multiple KEGG pathways and GO terms, and played a key role in gene co-expression in COPD. Unlike the networks of healthy non-smokers and healthy smokers, the gene co-expression network of smokers with COPD is concentrated in biological processes such as the G-protein-coupled receptor signaling pathway, inflammatory response, and signal transduction. (2) We created 7 Galaxy-GSA tools suited to different conditions. These tools provide a graphical interface for gene set analysis on the Galaxy platform. For example, ReactomePA applies a hypergeometric model to a set of Entrez gene IDs to perform gene set analysis on Reactome pathways. GSVA calculates a sample-specific enrichment statistic for each pathway to generate a pathway-by-sample matrix. chipenrich and methylGSA are GSA methods designed specifically for ChIP-Seq and methylation data, respectively. We tested the ReactomePA, SPIA, GSVA, chipenrich, methylGSA, GSAR and mogsa tools and released them in the Galaxy Tool Shed, so that Galaxy administrators can install them easily and more researchers can perform GSA. In addition, we built a Galaxy-GSA platform that collects all of the above GSA methods.

Conclusion: (1) The COPD case study shows that our Shiny app is a feasible way to analyze gene co-expression networks. Building on this, we will develop a platform that lets users perform in-depth analysis of gene co-expression simply by uploading gene expression data. (2) There are many types of GSA method, each with advantages and disadvantages. For GSA methods that are widely used but hard to use for researchers who do not program, we created the Galaxy-GSA series of tools and released the Galaxy-GSA platform, giving more people a graphical user interface for choosing a GSA method that suits their data.
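
The hypergeometric model mentioned for ReactomePA in the Result section can be illustrated with a small, self-contained sketch. This is not the ReactomePA implementation and not code from the thesis; it is a plain Python over-representation test, and the function names and gene identifiers are hypothetical. It assumes a gene universe of N genes, a pathway containing K of them, and a selected gene list of size n that overlaps the pathway in x genes, and it returns P(X >= x).

```python
from math import comb


def hypergeom_tail(universe_size, pathway_size, selected_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of pathway genes among `selected_size` genes drawn
    without replacement from a universe of `universe_size` genes, of
    which `pathway_size` belong to the pathway.
    """
    if not 0 <= pathway_size <= universe_size:
        raise ValueError("pathway_size must lie in [0, universe_size]")
    if not 0 <= selected_size <= universe_size:
        raise ValueError("selected_size must lie in [0, universe_size]")

    lowest = max(overlap, 0)
    highest = min(pathway_size, selected_size)
    total = comb(universe_size, selected_size)
    # comb(n, k) is 0 when k > n, so impossible overlaps contribute nothing.
    favourable = sum(
        comb(pathway_size, k)
        * comb(universe_size - pathway_size, selected_size - k)
        for k in range(lowest, highest + 1)
    )
    return favourable / total


def enrichment_p_value(selected_genes, pathway_genes, universe_genes):
    """Over-representation p-value for one pathway, given gene ID collections.

    Genes outside the universe are ignored (an assumption of this sketch).
    """
    universe = set(universe_genes)
    selected = set(selected_genes) & universe
    pathway = set(pathway_genes) & universe
    return hypergeom_tail(
        len(universe), len(pathway), len(selected), len(selected & pathway)
    )


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    universe = {f"g{i}" for i in range(1, 11)}
    pathway = {"g1", "g2", "g3", "g4"}
    selected = {"g1", "g2", "g9"}
    print(enrichment_p_value(selected, pathway, universe))
```

In a real analysis this test is repeated for every pathway and the p-values are then corrected for multiple testing, a step this sketch leaves out.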
Keywords/Search Tags:Gene Set Analysis, COPD, WGCNA, Galaxy, Gene regulatory network