
Analysis On The Variation Pattern And Evolution Of Insect Genome Size

Posted on: 2024-07-14  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Y Y Cong  Full Text: PDF
GTID: 1520307301479034  Subject: Agricultural Entomology and Pest Control
Abstract/Summary:
Genome size (GS) is defined as the total amount of DNA contained within a haploid chromosome set of an organism and is one of the fundamental properties of a species. In eukaryotes, GS varies over an extremely wide range and is only weakly correlated with biological complexity. Insecta is one of the most species-rich animal groups and thus serves as an ideal model for studying GS evolution. However, GS evolution in Insecta remains poorly understood. Here, the GSs of 21 insect species were estimated, and 1,326 insect GSs were collected from the literature and publicly available databases. The patterns of insect GS variation were then analyzed, as well as the relationship between GS variation and transposons. The main findings are as follows:

1. The discrepancy between insect GSs estimated by flow cytometry and by genome assembly

GS estimation is an important prerequisite for genome sequencing and assembly. However, there are large discrepancies between insect GS derived from genome assembly and GS estimated by flow cytometry and K-mer analysis. To clarify this discrepancy, the GS of 21 agricultural insects from six orders and 11 families was first estimated by flow cytometry, and flow cytometry-based GS records for 1,345 insects were collected from the Animal Genome Size Database. Genome assembly information for 546 insects was obtained from 14 genetic data sites, including NCBI, GigaDB, DDBJ, i5k Workspace@NAL, and InsectBase. After removing redundancy, a total of 202 insect species had both flow cytometry-based and assembly-based GS information. The assembled GS showed a significant positive correlation with the flow cytometry estimates (p < 0.001, Wilcoxon rank-sum test), indicating that the GSs estimated by the two methods were relatively consistent. Flow cytometry-estimated and assembled GS differed by > 10% in 140 insects, of which 98 had smaller assembled GS, especially in Diptera and Hymenoptera. The genome of
insects with larger assembled GS had significantly more repetitive sequences (p < 0.05, Wilcoxon rank-sum test), but there was no significant correlation with GC content, contig N50, genome sequencing platform, or assembly strategy. In most cases, the assembled GS was smaller, indicating that the genome assembly was incomplete. However, when the proportion of repetitive sequences was high, the assembly appeared redundant, resulting in a larger assembled GS.

2. The evolutionary patterns of insect GS variation

Insect GSs estimated by flow cytometry were collected through literature mining and from public databases. After removing redundancy, GS information for 1,326 insect species from 700 genera of 155 families in 21 orders was collected (as of December 2020). The analysis revealed that insect GSs ranged from 68.46 Mb (Clunio tsushimensis, Diptera) to 18.23 Gb (Bryodemella holdereri, Orthoptera), a 266-fold variation. The average GS of Diptera was the smallest (381.44 Mb, p < 0.001, Wilcoxon rank-sum test), and that of Orthoptera was the largest (0.91 Gb, p < 0.001, Wilcoxon rank-sum test). The average GS of holometabolous insects (Endopterygota) (536.7 Mb) was significantly smaller than that of hemimetabolous insects (2,781.4 Mb, p < 0.001, Wilcoxon rank-sum test) and ametabolous insects (3,003.5 Mb, p < 0.001, Wilcoxon rank-sum test). GS variation in holometabolous insects was also smaller (41.7-fold) than in hemimetabolous insects (169.5-fold). Insects within the same genus had similar GS values: in 85% of genera (159/187), GS differed by < 1-fold among species, whereas insects in the same family generally differed more, with 60% of insect families (57/96) showing > 1-fold differences in GS. Within Coleoptera, insects of the same family or genus showed greater GS differences than in other orders, which may be related to the order's species richness and morphological diversity. Among the 1,326 insect species with known GS, 1,256 insects from 667 genera of 154 families in 21
orders have species information in the NCBI Taxonomy database, which was used to construct a species tree of these insects. The macroevolutionary patterns of insect GS were then analyzed using phylogenetic comparative methods. The results showed a significant phylogenetic signal for GS in all insect taxa, indicating that insect GS changes are closely related to phylogenetic relationships. Evolutionary models of GS were then fitted for different insect taxa. GS evolution across Insecta was best fitted by the Ornstein-Uhlenbeck (OU) model, indicating that, on the whole, GS change in Insecta reflects an adaptive evolutionary process. Holometabolous insects were also best fitted by the OU model, while hemimetabolous insects were best fitted by the Brownian motion model of neutral evolution. This result suggests that metamorphosis may constrain the evolution of insect GS. Ancestral state reconstruction showed that the ancestral GS of Insecta was ~1 Gb, and that the GS of extant insects has undergone numerous contractions and expansions. Compared with the ancestral GS of Insecta, 76.5% of insect species showed > 10% genome contraction, indicating that genome contraction is the main trend of insect GS evolution. At the ancestral node of Neoptera (853.1 Mb), GS change diverged in two opposite directions, expansion and contraction. Polyneoptera (ancestral node GS of 2,666.9 Mb) showed a trend of expansion, while Paraneoptera and Endopterygota (ancestral node GS of 511.68 Mb and 437.52 Mb, respectively) showed a trend of contraction. The ancestral node GSs of Orthoptera and Hymenoptera were 4,764.21 Mb and 331.13 Mb, respectively, and extant insects in these orders generally have larger genomes, indicating that they have undergone significant genome expansion.

3. Genomic analysis of intrafamily GS differences among closely related insects

To investigate the reasons for GS differences among closely related
species, insects with > 2-fold GS differences within families and high-quality genome assemblies (BUSCO complete genes > 80% and scaffold N50 > 5 Kb) were selected for in-depth genome analysis. A total of 56 insect species from four orders and nine families were included, namely Chrysomelidae (4 species), Coccinellidae (4 species), Curculionidae (4 species), Lampyridae (5 species), Scarabaeidae (6 species), Staphylinidae (3 species), Braconidae (14 species), Nymphalidae (11 species), and Chironomidae (5 species). The analysis found no whole-genome duplication events in these insects, indicating that ploidy-level change was not the cause of GS differences between closely related insects within a family. A comparative analysis of the content of different genomic regions, such as exons, introns, non-coding regions, and repetitive sequences, showed that the length of exonic regions did not vary significantly within each family (18.06 Mb to 46.09 Mb). However, insect GS was significantly and positively correlated with transposon content (corrected Pearson correlation coefficient > 0.9, p < 0.001). Variation in transposon content was the main source of GS variation among closely related insects in every family except Chironomidae. The genomes of Chironomidae are small, with transposon content ranging from only 2.5% in Clunio marinus to 8.5% in Chironomus tentans. Thus, transposons do not constitute a major part of these genomes, and variation in non-coding regions such as introns and intergenic regions is the main cause of GS variation among these species. The historical dynamics of transposons in the genomes of 26 species of Coleoptera, 14 species of Hymenoptera, and 11 species of Lepidoptera showed that transposons in the larger genomes within a family displayed either an "L-shaped" pattern of recent activity (Kimura substitution level < 10%) or a "bimodal" pattern of continuous accumulation of
activity. The number of transposon families is much larger in insects with larger genomes within a family, yet only a few of these families show a specific expansion of transposon sequences. For example, transposons of the Tc1-Mariner, Penelope, Jockey, and RTE superfamilies contributed the most to GS differences among species in Coleoptera, while SINE/tRNA retrotransposons contributed the most to GS differences in the genus Heliconius (family Nymphalidae); these transposon families also showed a trend of increased "recent" transposon activity.

4. GS variation of three rice planthoppers

The brown planthopper Nilaparvata lugens (BPH), the white-backed planthopper Sogatella furcifera (WBPH), and the small brown planthopper Laodelphax striatellus (SBPH), of the family Delphacidae, order Hemiptera, are important rice pests with wide GS variation. The GS of the common ancestor of these three species was 794.33 Mb, indicating that the BPH genome has expanded significantly, while the WBPH and SBPH genomes have contracted to some degree. A new transposon annotation pipeline was assembled and used to annotate transposons in the three planthopper genomes. The numbers of transposon copies in the BPH and WBPH genomes were much higher, with transposons accounting for 66.8% and 54.68% of the genomes, respectively, compared with only 29.39% in SBPH. DNA transposons were the most abundant class in the genomes of all three planthoppers, and BPH contained a higher percentage of LTR and LINE retrotransposons than the other two planthoppers. The highest content of autosomal and sex chromosome transposons (> 50%) was found in the three planthoppers, and BPH also had a higher content of transposons (> 70%) in autosomes 1 and 10. The longer average length of single-copy transposon sequences in BPH (> 300 bp) compared with WBPH and SBPH (< 200 bp) may be associated with unequal rates of transposon sequence insertion and deletion. At
the superfamily level, three DNA transposon superfamilies (hAT, Tc1-Mariner, and CACTA) and two retrotransposon superfamilies (Gypsy and I) were the most abundant and were distributed across the chromosomes of all three planthoppers. Variation in the content of these superfamilies partly explains the GS differences among these planthoppers. Estimation of transposon divergence times revealed that the historical dynamics of transposon sequences in the SBPH and WBPH genomes were more similar to each other, with the timing of the "burst" of transposon activity (40 to 50 Mya) coinciding with the divergence of the two species (43.5 Mya). The presence of multiple "ancient" transposon sequences in the BPH genome, dating back to 175 Mya, could be a preserved trace of transposon sequences from the ancestor of the superfamily Fulgoroidea (179.8 Mya). Transposon sequences in WBPH and SBPH are overall "younger" than in BPH, with the oldest dating back to ~130 Mya. This suggests that the more "ancient" ancestral transposon sequences have been removed from the genomes of both WBPH and SBPH, and that this removal is responsible for the shrinking trend in GS in both species. At the superfamily level, frequent insertions of the Tc1-Mariner and hAT transposon superfamilies and the Gypsy retrotransposon superfamily in the recent evolutionary period (0 to 25 Mya) are the main reasons for the expansion of the BPH GS. In the analysis of horizontal transposon transfer (HTT) events, 13 significant HTTs were found between transposons of the three planthoppers and 22 insects from 8 orders, as well as 4 HTTs with species other than insects, with BPH involved in the most HTTs (10 cases). The results also revealed that ISWpi1 of Wolbachia was horizontally transferred to the SBPH genome and assimilated as one of the transposons of the Tc1-Mariner superfamily. These results suggest that HTTs can be a source of the planthoppers' transposons. However, since transposons involved in HTTs represent only a
very small fraction (< 0.05%) of the three planthopper genomes, their effect on GS change is minimal.

In sum, this thesis analyzes the discrepancy between assembled GS and flow cytometry estimates, the macroevolutionary pattern of insect GS, the differences in genome content between closely related insects with varied GS, and the evolutionary relationship between transposons and GS. This study lays the foundation for a better understanding of the evolution of insect GS and provides an important reference for GS evolution analysis in other biological taxa.
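
The fold-variation figure and the > 10% discrepancy criterion reported in the abstract can be illustrated with a minimal Python sketch. It assumes the discrepancy is measured relative to the flow cytometry estimate, which the abstract does not state. The function names and the single-species example values are hypothetical; only the 68.46 Mb and 18.23 Gb extremes come from the text.

```python
def fold_variation(sizes_mb):
    """Ratio of the largest to the smallest genome size in a collection (all in Mb)."""
    return max(sizes_mb) / min(sizes_mb)


def classify_discrepancy(flow_mb, assembled_mb, threshold=0.10):
    """Compare an assembled GS with a flow cytometry estimate.

    Assumption: the difference is expressed as a fraction of the flow
    cytometry estimate; the abstract does not specify the denominator.
    """
    relative_diff = (assembled_mb - flow_mb) / flow_mb
    if relative_diff > threshold:
        return "assembly larger"
    if relative_diff < -threshold:
        return "assembly smaller"
    return "consistent"


# Smallest and largest insect GS reported in the abstract (18.23 Gb = 18,230 Mb).
print(round(fold_variation([68.46, 18230.0])))

# Hypothetical species: 500 Mb by flow cytometry, 400 Mb assembled.
print(classify_discrepancy(500.0, 400.0))
```

Under this assumption, a species labelled "assembly smaller" would fall among the 98 insects whose assemblies are likely incomplete, while "assembly larger" would correspond to the redundant, repeat-rich assemblies described in section 1.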
Keywords/Search Tags: insect, genome, transposon, genome size, variation pattern, evolution