Construction And Analysis Of The Databases For Essential Genes And Replication Origins

Posted on: 2017-05-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Luo
GTID: 1310330512480272
Subject: Biophysics
Abstract/Summary:
Advances in DNA sequencing technology have led to explosive growth in biological data. Mining valuable information and models from these massive data sets has become a major issue in bioinformatics research. Essential genes and replication origins are both genomic elements that are absolutely required for cell survival. This dissertation concentrates on the collection and analysis of data on essential genes and replication origins, and describes the related databases together with their bioinformatics analyses and applications.

The first part of the dissertation focuses on essential genes. First, we introduce the new release of the Database of Essential Genes (DEG). Compared with previous versions, DEG contains many more prokaryotic and eukaryotic essential genes determined by genome-wide gene essentiality screens. Essential noncoding genomic elements, such as noncoding RNAs, promoters, regulatory sequences and replication origins, have also been added to the database. We have also developed customizable BLAST tools that allow users to perform species- and experiment-specific BLAST searches for a single gene, a list of genes, or annotated or unannotated genomes. Next, we show that the percentage of essential genes in bacterial genomes decays exponentially with increasing genome size, and we perform gene ontology (GO) enrichment analysis with the data in DEG. Finally, we perform a comprehensive comparison between essential and nonessential genes in 23 bacterial species based on the Ka/Ks ratio, and find that essential genes are more evolutionarily conserved than nonessential genes in most of the bacteria examined. We further analyse conservation by functional category using the clusters of orthologous groups (COGs), and find that essential genes in the categories G (Carbohydrate transport and metabolism), H (Coenzyme transport and metabolism), I (Lipid transport and metabolism), J (Translation, ribosomal structure and biogenesis), K (Transcription) and L (Replication, recombination and repair) tend to be more evolutionarily conserved than nonessential genes in bacteria.

The second part of the dissertation analyses the characteristics of DNA replication origins. We first collect a large number of replication origins (oriCs) from a wide variety of organisms and construct two databases of replication origins, DoriC for prokaryotes and DeOri for eukaryotes. DoriC has been substantially expanded in the number of bacterial genomes it covers. In addition, oriC regions in archaeal genomes identified by in vivo experiments as well as by in silico analyses have been added to the database. The Database of Eukaryotic replication origins (DeOri), which contains the eukaryotic replication origins identified by the genome-wide analyses currently available, is the first database of eukaryotic replication origins. Finally, we introduce Ori-Finder 2, the first web-based tool for predicting oriCs in archaeal genomes. The software is accurate for genomes with a single oriC, but for genomes with multiple oriCs it does not necessarily find all origins of replication, owing to the limitations of the method.
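
The exponential relationship between the percentage of essential genes and genome size, mentioned in the first part above, can be illustrated with a minimal sketch. It assumes a model of the form pct = a * exp(-b * size) and estimates a and b by ordinary least squares on the logarithm of the percentage. The function name `fit_exponential_decay`, the units (Mb), and the sample values are assumptions for illustration; the values are synthetic, generated from a known curve, and are not taken from DEG or from the dissertation. The fitting procedure actually used by the author is not stated in the abstract.

```python
import math


def fit_exponential_decay(sizes_mb, essential_pct):
    """Fit pct = a * exp(-b * size) by least squares on log(pct).

    Hypothetical helper for illustration; returns the tuple (a, b).
    """
    if len(sizes_mb) != len(essential_pct) or len(sizes_mb) < 2:
        raise ValueError("need at least two paired observations")
    if any(p <= 0 for p in essential_pct):
        raise ValueError("percentages must be positive to take logarithms")

    # Taking logs turns the model into a straight line:
    # log(pct) = log(a) - b * size
    logs = [math.log(p) for p in essential_pct]
    n = len(sizes_mb)
    mean_x = sum(sizes_mb) / n
    mean_y = sum(logs) / n

    sxx = sum((x - mean_x) ** 2 for x in sizes_mb)
    if sxx == 0:
        raise ValueError("genome sizes must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes_mb, logs))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic example: points generated from a known curve (a = 80, b = 0.4),
# NOT real genome data.
sizes = [0.5, 1.0, 2.0, 4.0, 6.0]
pcts = [80.0 * math.exp(-0.4 * s) for s in sizes]
a, b = fit_exponential_decay(sizes, pcts)
```

With real data the points would scatter around the curve, so the recovered a and b would be estimates, and a goodness-of-fit measure would be needed before calling the trend exponential.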
Keywords/Search Tags: Essential genes, Replication origins, Public biological database, Archaeal genome, Evolutionary conservation