Single-base Resolution Map Of Evolutionary Constraint And Annotation Of Conserved Elements Across Grass Genomes

Posted on: 2018-12-24
Degree: Master
Type: Thesis
Country: China
Candidate: P P Liang
GTID: 2530305123962879
Subject: Bioinformatics
Abstract/Summary:
Poaceae is a large family of monocotyledons that includes many important cereal crops. Over thousands of years, humans have domesticated many cereal plants, including rice, wheat, sorghum and millet, among others. The grasses play an important role in the agricultural economy: they provide crops not only for human energy consumption but also for livestock feed. About 60% of the world's food is currently derived from the grasses.

With plummeting sequencing costs, the full genomes of many economically important crops and model organisms have now been sequenced. Comparative genomics is critical for downstream evolutionary and functional studies. The study and comparison of mammalian genomes has progressed rapidly over the past 20 years, because mammalian genomes were completed early and are largely similar to one another in structure. For plant genomes, in contrast, comparisons are less straightforward. In particular, noncoding sequences in plants remain under-studied because of the challenges associated with plant genome comparison.

Conserved noncoding sequences (CNSs) are DNA sequences that are evolutionarily conserved and are of interest because of their potential role in gene regulation. In cereal genomes, many CNSs are known to be associated with important agronomic traits and ecological adaptations. Although exon prediction methods are relatively mature, efficient methods for predicting the exact locations of noncoding regulatory elements in plant genomes are still lacking. Because of sub-functionalization following recurring polyploidy events and massive transposon activity in the grass genomes, the mainstream algorithms used to compare mammalian genomes cannot be applied directly. Herein, we have designed a computational pipeline specifically tailored to the comparison of plant genomes, in order to annotate conserved sequences in the draft grass genomes and other non-grass monocot genomes.

In this study, we used 17 previously published grass genomes, along with 5 non-grass monocot genomes and the genome of the basal angiosperm Amborella trichopoda, for a total of 23 genomes. To our knowledge, this is the largest pan-monocot genome comparison to date. Genome alignments suggest that at least 12.05% of the O. sativa ssp. japonica genome is evolving under selective constraint in Poaceae, and that nearly half of the evolutionarily constrained sequence is located outside protein-coding regions. By analyzing the minor allele frequency of rice japonica SNPs, we also found evidence for purifying selection acting on conserved sequences, identifying an important link between intra-specific and inter-specific variation.

Our results show that CNSs have remained conserved over millions of years of evolution. Specifically, we found that genes located downstream of CNSs tend to be related to regulation, suggesting that they play vital roles in plant growth and development. Furthermore, we found that certain motifs were significantly enriched in CNSs, and that most of them were associated with the binding activities of known transcription factors. We have curated 21,117,687 and 9,336,098 CNSs in Poaceae and Monocot, respectively; the curated elements are accessible through our database, web services and interactive genome browser tools. The functional annotation and evolutionary dynamics of the identified conserved sequences provide a solid foundation for downstream studies of gene regulation and genome evolution, and will further inform the study of gene function for cereal biologists.
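The minor allele frequency comparison mentioned above can be illustrated with a minimal sketch. The abstract does not describe how the analysis was actually carried out, so everything here is an assumption for illustration: the function names, the input layout (alternate allele count, total allele count, and a flag for whether the SNP falls in a conserved element), and the toy numbers are all hypothetical. The idea shown is only the general one: purifying selection tends to keep variants at constrained sites rare, so SNPs inside conserved elements are expected to show lower minor allele frequencies than SNPs elsewhere.

```python
from statistics import mean


def minor_allele_frequency(alt_count, total_alleles):
    """Return the frequency of the rarer of the two alleles at a biallelic SNP."""
    freq = alt_count / total_alleles
    return min(freq, 1.0 - freq)


def mean_maf_by_class(snps):
    """Average MAF for SNPs inside and outside conserved elements.

    `snps` is an iterable of (alt_count, total_alleles, in_conserved_element)
    tuples; this layout is a hypothetical one chosen for the example.
    Returns (mean_maf_conserved, mean_maf_other); a class with no SNPs gives None.
    """
    conserved, other = [], []
    for alt_count, total_alleles, in_conserved in snps:
        if total_alleles == 0:
            continue  # skip sites with no called alleles
        maf = minor_allele_frequency(alt_count, total_alleles)
        (conserved if in_conserved else other).append(maf)
    return (
        mean(conserved) if conserved else None,
        mean(other) if other else None,
    )


# Toy, made-up data: three SNPs in conserved elements, three elsewhere.
toy_snps = [
    (1, 100, True),
    (2, 100, True),
    (97, 100, True),   # alternate allele is the major allele here
    (30, 100, False),
    (45, 100, False),
    (60, 100, False),
]

maf_conserved, maf_other = mean_maf_by_class(toy_snps)
print(f"mean MAF in conserved elements: {maf_conserved:.3f}")
print(f"mean MAF elsewhere:             {maf_other:.3f}")
```

In a real analysis the two MAF distributions would be compared with a statistical test over genome-wide SNPs rather than by comparing two means on a handful of sites.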
Keywords/Search Tags: conserved noncoding sequences, synteny, monocot, purifying selection, comparative genomics