
Comparative Genomics Characterization Analysis Of Plant Introns And Its Application In Nicotiana

Posted on: 2022-09-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y Gao
GTID: 2480306311462044
Subject: Tobacco science
Abstract/Summary:
Introns are non-coding sequences that interrupt genes. In model species, a growing number of studies have shown that introns are involved in important biological processes such as transcriptional regulation, alternative splicing, and nuclear export. The characteristics of introns in tobacco, however, still need further study. With the rapid development of sequencing technology and the falling cost of sequencing, the large amount of plant omics data now available provides a basis for building a comprehensive plant intron analysis platform. In this study, bioinformatics and comparative genomics methods were therefore used to extract intron information from the genome annotation files and whole-genome sequence files of sequenced plants, and to analyze and compare intron characteristics among different families and within Nicotiana. A comprehensive plant intron database was then built. It provides a data platform for further study of intron length distribution, GC content, and splicing characteristics, and a data basis and reference for intron marker development and molecular breeding in tobacco and other plants. The main results of this study are as follows:

(1) Intron information analysis in plants. Detailed information on 121 plant genomes was collected and summarized as a table of plant genome resources, which is available from the download interface of the plant intron database. Using Perl scripts, 6,402,628 transcripts and 25,250,021 introns were extracted on a large scale from 121 species in 45 families, an average of 3.94 introns and 4.94 exons per transcript.

(2) Analysis of intron length and GC content in different families. Among all introns, those of 51-100 bp made up the largest proportion, 29.68%. Cruciferae had an average intron length of less than 300 bp, which is short relative to other families. Average intron length first increased with GC content and, after reaching a maximum, decreased as GC content rose further. In Cruciferae, Solanaceae, Rosaceae, Leguminosae, and Rutaceae, average intron length was greatest when the average intron GC content was between 30% and 40%. In Gramineae, average intron length was greatest when the average intron GC content was about 40%.

(3) Analysis of intron length and GC content in Nicotiana. In Nicotiana, introns of 51-100 bp were the most numerous, and introns longer than 1,000 bp accounted for about 10% of the total. In Nicotiana tabacum K326 in particular the figure was nearly 20%, indicating a substantial proportion of long introns. Across the 8 tobacco species, average intron length reached its maximum when the average GC content was nearly 40% and then decreased as GC content increased.

(4) Conservation of the 5' and 3' splice sites. A total of 236 sequence maps covering the 5' and 3' splice sites were obtained in this study. The sequences upstream and downstream of the intron-exon splice sites were highly conserved, with a conserved GT dinucleotide at the 5' splice site of the intron and a conserved AG dinucleotide at the 3' splice site, strictly following the GT-AG rule.

(5) A total of 379,309 simple sequence repeats were detected in intron sequences in Nicotiana, among which mononucleotide repeats were the most abundant, accounting for 59.23%. The combined proportion of mono-, di-, and trinucleotide repeats in introns was essentially the same across the 8 tobacco species.

(6) Construction of the plant intron database (PID). A publicly available, searchable database was developed to efficiently store, query, analyze, and integrate plant intron resources, and tobacco is used as an example to show how the database is used. PID contains 25,250,021 introns, 31,652,649 exons, and 414 visual maps. Users can not only view intron length distribution charts and 5' and 3' splice site sequence feature maps in a statistics interface, but also browse and download the positions and sequences of introns, exons, and genes in a graphical interface through JBrowse. ViroBLAST for sequence homology searches, an intron detection tool, and a sequence interception tool are also provided. These tools should greatly accelerate research on the distribution, length characteristics, and functions of introns in plants. PID is accessible at http://biodb.sdau.edu.cn/PID/index.php.
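The abstract describes the extraction pipeline only at a high level: Perl scripts that derive introns from annotation and genome sequence files. The following Python sketch illustrates the general idea on a toy example and is not the author's code. It assumes 1-based, inclusive exon coordinates as used in GFF-style annotation, handles only plus-strand transcripts, and all function names are hypothetical.

```python
def introns_from_exons(exons):
    """Return intron intervals for one transcript.

    `exons` is a list of (start, end) tuples, 1-based and inclusive
    (an assumption matching GFF-style coordinates). An intron is the
    gap between two consecutive exons.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start - prev_end > 1:  # skip adjacent or overlapping exons
            introns.append((prev_end + 1, next_start - 1))
    return introns


def intron_sequence(genome, interval):
    """Slice a plus-strand intron out of a genome string (1-based, inclusive)."""
    start, end = interval
    return genome[start - 1:end]


def gc_content(seq):
    """GC content of a sequence as a percentage (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def follows_gt_ag(seq):
    """True if the intron starts with GT and ends with AG (the GT-AG rule)."""
    seq = seq.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


if __name__ == "__main__":
    # Toy data: two exons separated by a 12 bp intron.
    genome = "ATGGCC" + "GTAAGTTTTCAG" + "GGATAA"
    exons = [(1, 6), (19, 24)]
    for interval in introns_from_exons(exons):
        seq = intron_sequence(genome, interval)
        print(interval, len(seq), round(gc_content(seq), 2), follows_gt_ag(seq))
```

Because a transcript with n exons has n - 1 introns, the totals reported in the abstract are mutually consistent: 25,250,021 introns plus 6,402,628 transcripts gives 31,652,649 exons, the exon count stated for PID.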
Keywords/Search Tags: plant, intron, GC content, database, tobacco