
Pipeline Software Development Of ChIP-Seq Data Analysis

Posted on: 2018-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: J W Man
GTID: 2348330515987540
Subject: Bioinformatics
Abstract/Summary:
Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) is currently the mainstream technology for studying mechanisms of gene expression regulation, and is mainly used to study interactions between DNA and proteins. With the continuous development of bioinformatics, software tools that integrate ChIP-Seq analysis, such as Cistrome, CisGenome, seqMINER and Nebula, have come into wide use. However, these existing tools still leave room for improvement in user interaction, input file handling and result output. The main work of this thesis is to study the general workflow of ChIP-Seq analysis and, on that basis, develop a more comprehensive, more user-friendly and more functional ChIP-Seq analysis tool: CSATK (ChIP-Seq Analysis Toolkit). CSATK is written in Java, a high-level programming language, and inherits Java's advantages such as cross-platform support and multithreading. By calling the basic software used in ChIP-Seq analysis, CSATK can not only complete the conventional ChIP-Seq data analysis process (from quality control to motif analysis), but also simplify problems the analyst may encounter along the way, such as setting up the environment and installing software. In addition, it provides standalone tools for data analysis (such as parsing FastQC quality control results, GO analysis and pathway analysis). CSATK implements most of the functionality of existing ChIP-Seq analysis software and also adds distinctive features, such as analysis reports in HTML form. In this thesis, CSATK was used to analyze 11 groups of ChIP-Seq data; the complete process took 11.5 hours, saving considerable time compared with manual operation. Finally, a preliminary comparison of CSATK with existing ChIP-Seq analysis pipeline tools such as Nebula, CisGenome, Cistrome and seqMINER finds that CSATK performs better in data alignment, result reporting and parallel computing.
Keywords/Search Tags: transcription factor, histone, ChIP-Seq, software development