Genomic Mining And In Silico Analysis Of Thiopeptide Gene Cluster And Non-Coding RNA Of Bacteria

Posted on:2014-01-15

Degree:Master

Type:Thesis

Country:China

Candidate:J Li

Full Text:PDF

GTID:2230330392961174

Subject:Biology

Abstract/Summary:

DNA sequencing technology is advancing rapidly and has yielded the complete genome sequences of nearly 2000 prokaryotes, and the widely used next-generation sequencing (NGS) technology is also promoting bacterial transcriptomic studies. However, the rapidly growing volume of omics data urgently calls for fast, in-depth mining to characterize phenotypes. In this study, we discuss bioinformatic strategies for mining biological big data, with a focus on thiopeptide gene clusters and non-coding RNAs (ncRNAs) of bacteria.

Thiopeptides are a growing class of sulfur-rich, highly modified heterocyclic peptides that are mainly active against Gram-positive bacteria, including various drug-resistant pathogens. Recent studies also reveal that many thiopeptides inhibit the proliferation of human cancer cells, further expanding their potential for clinical use. Thiopeptide biosynthesis shares a common paradigm, featuring a ribosomally synthesized precursor peptide and conserved posttranslational modifications that afford a characteristic core system, but differs in the tailoring steps that furnish individual members. We have developed a web-based tool, ThioFinder, that uses a profile Hidden Markov Model approach to rapidly identify thiopeptide biosynthetic gene clusters and the cleavage sites of precursor peptides from DNA sequences. Fifty-four new putative thiopeptide biosynthetic gene clusters were found in sequenced bacterial genomes of microorganisms not previously known to be producers. By taking advantage of the increasing amount of bacterial DNA sequence information, the identification of new thiopeptide gene clusters may facilitate the discovery of new thiopeptides and enrich the set of unique biosynthetic elements available for producing novel drug leads through combinatorial biosynthesis.

An ncRNA is a functional RNA molecule that is not translated into a protein. ncRNAs play important regulatory roles in a variety of cellular processes, such as bacterial pathogenesis and drug resistance. The use of RNA-Seq technology in transcriptomics has led to a rapid, high-throughput increase in the number of identified bacterial ncRNAs. The second part of this study first collected 1490 ncRNAs from 17 bacterial strains by mining reported RNA-Seq data, and then expanded the dataset with 878 published, experimentally verified ncRNAs. With the resulting large dataset, we identified the conserved sequences of the promoter and terminator regions of ncRNA genes, which were found to be closely related to the G+C content of the bacterial genomes. The refined sequence features may facilitate the prediction of ncRNA genes in bacterial nucleotide sequences.
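The G+C content mentioned above is the fraction of guanine and cytosine bases in a sequence. As a minimal sketch, and not the method used in the thesis, it can be computed as follows. The function names `gc_content` and `upstream_region`, the default 50-base window, and the forward-strand, 0-based coordinate convention are all assumptions made for illustration.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous A/C/G/T bases.

    Ambiguity codes such as N are ignored rather than counted.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return gc / total


def upstream_region(genome: str, start: int, length: int = 50) -> str:
    """Return up to `length` bases immediately upstream of a gene start.

    Hypothetical helper: assumes a forward-strand gene and a 0-based
    `start` index; the window is truncated at the beginning of the genome.
    """
    return genome[max(0, start - length):start]


if __name__ == "__main__":
    # Toy sequence, not real genomic data.
    toy_genome = "AAAACCCCGGGG"
    print(gc_content(toy_genome))
    print(upstream_region(toy_genome, start=8, length=4))
```

Comparing `gc_content` of a whole genome with that of the windows returned by `upstream_region` is one simple way to examine how a putative promoter region differs from its genomic background.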

Keywords/Search Tags:

thiopeptide biosynthesis, bacterial ncRNA, bacterial genomics, bioinformatics webserver, molecular biology database

Related items

1	Thiopeptide Antibiotic Cyclothiazomycin: Identification And Analysis Of Its Biosynthetic Gene Cluster In Streptomyces Hygroscopicus 10-22
2	Discovery Of Ribozyme And New Structured NcRNA By Bioinformatics
3	Ecological Evolution And Biodiversity Database Of The Global Smilacaceae
4	Development Of Reproductive Biology Related Bioinformatics Tools And Databases
5	Discovery, Functional Analysis And Database Construction Of Human And Invertebrate Structural Non-coding RNA
6	Scientific Innovation Characteristics Of The Molecular Biology Research
7	Natural Products Heterologous Biosynthesis Based On Systems-synthetic Biology
8	Bioinformatics Analyses Of Protein Ubiquitination Sites
9	Analysis Of Prognostic Key Genes And Pathways In Pancreatic Adenocarcinoma Based On Bioinformatics
10	Analyses of antibiotic biosyntheses in Streptomyces spp.: The molecular biology of nonactin biosynthesis and the novel biochemistry of daunorubicin biosynthesis