Comparative Analysis Of SSRs Based On Whole Genome Sequences Of 140 Plants And Preliminary Study Of Potential Functions Of Long SSRs

Posted on: 2021-05-14; Degree: Master; Type: Thesis
Country: China; Candidate: L Zhu; Full Text: PDF
GTID: 2370330602971733; Subject: Ecology
Abstract/Summary:
Simple sequence repeats (SSRs) are abundant in both eukaryotic and prokaryotic genomes. Recent advances in genetic analysis and genotyping methods have rapidly expanded the use of molecular markers to address ecological problems, and SSR markers have become the most popular and widely used markers in ecological applications. SSRs also play an important role in the growth, development and adaptive evolution of organisms. Building on these characteristics and on the plant genome data published to date, we used bioinformatics methods such as comparative genomics to study the genomic SSRs of 140 sequenced plants. The results are as follows.

1. Plant genome characteristics and their relationship with SSR characteristics

Gymnosperms have the largest genomes among the currently sequenced species (average 16,529 Mbp), and algae show the largest variation in genome size (the largest genome is 167 times the size of the smallest). Fern genomes are small, monocot genomes are larger than dicot genomes among the angiosperms, and genome size differs among dicot families (sizes compared as means). A total of 283,867,588 SSRs were identified using the MISA program. The number of SSRs was strongly positively correlated with genome size, while SSR density was weakly negatively correlated with genome size. SSR density varied widely among algae but only slightly among angiosperms.

2. SSR distribution across plant taxa

Hexanucleotide repeats are the most abundant SSR type in plant genomes, and decanucleotide repeats are the least abundant. Algae show different SSR type preferences from the other plant groups: the second most abundant type in algae is trinucleotide, whereas in the other groups it is heptanucleotide. The third most abundant type varies not only between taxa but also between families. We found a positive correlation between the GC content of SSRs and the GC content of the genome, with a few exceptions. Algae showed the largest variation in GC content and the highest average GC content, followed by Poaceae among the monocots. The GC content of mosses, ferns and dicots was relatively low; Fabaceae (dicots) had the lowest average GC content, only 31.13%. Analysis of the relationship between GC content and SSR motifs showed that motif preference is influenced by GC content: in general, the higher the GC content, the more G/C-rich motifs among the SSRs, especially in algae and Poaceae. Because A/T-rich motif types outnumber G/C-rich ones, the GC content at which motif preference shifts in algae and Poaceae is not 50%. Once the GC content of their genomes reached a certain point (below 50%), the number of G/C-rich motifs among the top 10 motifs increased significantly.

3. Distribution and potential function of long SSRs in plants

Statistical analysis of the distribution of extremely long SSRs in the genome (length ≥ 1000 bp) showed that dicots have considerably more long SSRs. By SSR type, dinucleotide and trinucleotide repeats outnumbered hexanucleotide repeats. The most numerous motifs were the dinucleotide repeats AG/AT/AC (including their reversed forms GA/TA/CA) and the trinucleotide repeats TTA/TTC (including their shifted forms TAT/ATT/TCT, etc.). Long SSRs (length ≥ 500 bp) were rarer in the CDS than in the genome as a whole. Analysis of protein sequences containing long SSRs found that the presence of long SSRs did not affect protein function, although the function of the SSRs themselves needs further analysis. The potential functions and effects of long SSRs were investigated and verified in Gossypium hirsutum and Solanum tuberosum. Long SSRs in the CDS affect protein structure, and the length of non-triplet motifs has a greater effect on protein secondary structure than that of triplet motifs.

This study provides useful insights into the preferences, characteristics and distribution of SSRs in plants and their potential role in evolution.
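
To make the quantities discussed in the abstract concrete (SSR counts, SSR density and GC content), the following is a minimal Python sketch of perfect-repeat detection. It is not the MISA program used in the thesis; the minimum repeat thresholds in `MIN_REPEATS`, the restriction to motifs of 1 to 6 bp, and the function names are all illustrative assumptions.

```python
import re

# Hypothetical minimum repeat counts per motif length (bp).
# These are illustrative values, not the settings used in the thesis.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """Return True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif[:k] * (n // k) == motif for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-bp unit followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        pos = 0
        while pos < len(seq):
            m = pattern.match(seq, pos)
            if m and is_primitive(m.group(1)):
                hits.append((m.start(), m.group(1), len(m.group(0)) // k))
                pos = m.end()  # skip past the accepted repeat
            else:
                pos += 1
    return sorted(hits)


def gc_content(seq):
    """Fraction of G and C among the A/C/G/T bases of seq."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / total if total else 0.0


def ssr_density(hits, seq_len):
    """Number of SSRs per megabase of sequence."""
    return len(hits) / seq_len * 1e6


# Toy sequence containing one di-, one tri- and one mononucleotide repeat.
toy = "CC" + "AG" * 8 + "CCT" + "TTA" * 6 + "G" + "A" * 12 + "C"
ssrs = find_ssrs(toy)
print(ssrs)
print(ssr_density(ssrs, len(toy)), gc_content(toy))
```

Applied per genome, counts and densities computed this way could then be compared against genome size and GC content, which is the kind of relationship the abstract reports. Motif shifts such as AG/GA are reported as found here and are not merged into a canonical form.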
Keywords/Search Tags:Plants, Evolution, Comparative genomes, Simple sequence repeats