
Genome-wide Identification And Characterization Of Transposable Elements In Mulberry (Morus Notabilis)

Posted on: 2015-04-08; Degree: Doctor; Type: Dissertation
Country: China; Candidate: B Ma; Full Text: PDF
GTID: 1220330467473877; Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Transposable elements (TEs; transposons and retrotransposons) are genomic sequences that can change their position within the genome, creating or reversing mutations and altering the cell's genome size. TEs were first discovered in maize by Barbara McClintock, and they have tremendous effects on genome structure and gene function. TEs have since turned out to be widespread mobile elements in almost all eukaryotic organisms. Emerging evidence suggests that TEs influence the evolution, structure and amplification of genomes, as well as gene creation, mutation, transcriptional regulation, and the formation of microRNAs.
Mulberry is a perennial woody plant whose leaves have long been used to feed silkworms. Mulberry also has recognized roles in ecological protection, oriental medicine, nutrition and health, and water and soil conservation. For example, its leaf, fruit, bark and root are all important in traditional Chinese medicine. So far, very few molecular studies of mulberry have been reported, especially analyses of transposable elements. Characterizing TEs is important for understanding their function and the evolution of mulberry. The completion of mulberry genome sequencing makes it possible to study transposons at the whole-genome level.
In this study, the identification, classification, distribution and evolutionary analysis of mulberry TEs were performed by combining multiple methods. LTR retrotransposons make up a substantial fraction of TEs. Some LTR retrotransposons insert into other retrotransposons to form nested LTR retrotransposons, which play important roles in the formation of centromeres and in the evolution and expansion of the genome. These elements were also characterized.
The major results were as follows:
I. Genome-wide identification of TEs and construction of a mulberry transposable elements database
1. To identify mulberry TEs completely and accurately, a combination of methods was used, including de novo, signature-based and similarity-based approaches. All results were then compared with Repbase, the Plant Repeat Database and RepeatPep. In total, 5925 mulberry TEs were identified. These TEs can be classified into 13 superfamilies: Copia, Gypsy, Lard, Trim, L1, RTE, hAT, CMC, PIF-Harbinger, MuLE, TcMar, MITE, and Helitron.
2. Based on the 80-80-80 rule, all TEs can be classified into 1062 families: Copia (226), Gypsy (145), Lard (312), Trim (119), L1 (19), RTE (30), PIF-Harbinger (31), hAT (44), CMC (38), MuLE (39), TcMar (1), MITE (26), and Helitron (32).
3. To provide efficient and user-friendly access to these data, a web-based database, MnTEdb, was built using Linux, Apache, MySQL, and Perl/PHP. Users can search and download the information of interest and can also run analyses with the tools provided in the database, including BLAST, GetORF, HMMER, Sequence Extractor and JBrowse. To help users make full and efficient use of the mulberry TE data, we are committed to continuously improving its applications and to adding more TE data from Morus species in the future. MnTEdb will be a valuable resource for research into the comparative and evolutionary dynamics of TEs between Morus and other plant species at the whole-genome level.
II. Characterization of TEs in mulberry
1. A total of 125.3 Mb of sequence (37.87%, 125.3/330.79) can be annotated as TEs. Among these, LTR retrotransposons are the major component (29.26%), including Copia (10.44%), Gypsy (9.20%), and Lard (8.59%). DNA transposons account for only 8.6%.
These results are consistent with those reported for some other species.
2. A total of 245 scaffolds longer than the scaffold N50 were selected to analyze the correlation between the proportional coverage of TEs and of genes. When individual scaffolds were analyzed as units, TE coverage and gene coverage showed a strong inverse correlation (r = -0.759, p < 0.01). Thirty representative scaffolds (ten TE-rich scaffolds, ten gene-rich scaffolds, and ten scaffolds with similar TE and gene coverage) were selected for further analysis and divided into equally sized 50 kb windows. Similar results were obtained, suggesting that LTR retrotransposons, the most abundant TEs in the genome, have a great impact on transposon distribution.
3. A gene was considered expressed when its RPKM value was greater than 1 in at least one of five tissues. A correlation analysis between TE coverage and the proportion of expressed genes showed a significant inverse correlation (r = -0.556, p < 0.01), suggesting that TEs have a great impact on the transcription of adjacent genes.
4. Three regions were examined to determine whether TEs have a higher propensity to insert within or close to genes: the gene body, the 2 kb regions upstream and downstream of a gene, and the 2-5 kb regions upstream and downstream of a gene. Some TEs showed a higher propensity to insert within or close to genes. Because TEs close to genes can become positive regulators of gene expression or targets for epigenetic silencing, this insertion pattern might have important implications.
5. Functional annotation and pathway analysis were performed for the TE-related genes. The biological processes of these genes mainly involved metabolic process (37.1%), cellular process (29.6%) and single-organism process (10.7%).
The major molecular function terms were binding (46.4%) and catalytic activity (39.8%). The cellular component terms mainly involved cell (36.3%), membrane (29.1%), organelle (19.1%) and macromolecular complex (12.5%). KEGG pathway analysis showed that these genes were involved in at least 104 pathways. These results suggest that TEs have a great impact on the host genome.
III. Isolation and characterization of reverse transcriptase fragments of LTR retrotransposons
1. In total, 106 clones of Ty1-Copia rt and 101 clones of Ty3-Gypsy rt were cloned from the mulberry genome using degenerate primers.
2. The isolated Ty1-Copia rt fragments ranged from 240 to 278 bp in length and the Ty3-Gypsy rt fragments from 408 to 437 bp. All cloned rt fragments were rich in A+T. The AT/GC ratio ranged from 1.16 to 1.58 in Ty1-Copia rt fragments and from 1.43 to 1.55 in Ty3-Gypsy rt fragments. The Ty3-Gypsy rt fragments showed higher sequence similarity (0.535 to 0.997) than the Ty1-Copia rt fragments (0.419 to 0.992). In conclusion, these rt fragments were highly heterogeneous, and Ty1-Copia rt fragments were more divergent than Ty3-Gypsy rt fragments.
3. About 53.8% of the Ty1-Copia rt fragments and 48.5% of the Ty3-Gypsy rt fragments were intact sequences. The rest had disrupted reading frames. The high heterogeneity of these rt fragments was caused by randomly distributed frame-shift mutations and premature stop codons.
4. Together with the high similarity among several rt fragments obtained from distantly related species, the phylogenetic analysis of rt fragments suggested some horizontal transposon transfer events among different plant species.
5. Ka/Ks analysis revealed that all the rt fragments of Ty1-Copia and Ty3-Gypsy were under strong purifying selection.
Moreover, purifying selection was stronger for Ty1-Copia.
6. FISH experiments showed that both Ty1-Copia and Ty3-Gypsy retrotransposons were preferentially located in the pericentromeric heterochromatin of mulberry chromosomes. Fewer hybridization signals were observed in the telomeric regions.
IV. Characterization of LTR retrotransposons in mulberry
1. In total, 584 tRNA sequences were identified in the mulberry genome, of which 553 could be classified. A total of 3892 LTR retrotransposons were identified in mulberry, including Copia (1532, 10.12% of genome size), Gypsy (1384, 9.07% of genome size), Lard (722, 8.59% of genome size) and Trim (254, 0.61% of genome size). Of these, 44.4% (1728/3892) formed nested structures.
2. LTR retrotransposons have different primer binding sites (PBS). Most LTR retrotransposons use a specific host tRNA as a primer for reverse transcription. tRNA preference differed among families. tRNAMet was the most preferred in Copia (41.2%), Gypsy (28.0%), Lard (13.9%) and Trim (27.2%).
3. All LTR retrotransposons had the typical "TG-CA" structure. LTR length and LTR retrotransposon length were positively correlated (r = 0.343, p < 0.01). There was also a strong positive correlation between the proportion of nested LTR retrotransposons and the average length of LTR retrotransposons in a given superfamily. We therefore speculate that the length expansion of LTR retrotransposons in a superfamily resulted from the formation of nested structures.
4. Phylogenetic analyses suggested that the Copia and Gypsy superfamilies can be separated according to the similarity of their RT and INT domains. The Copia superfamily was probably derived from ancient elements through inter-element recombination.
Alternatively, the results suggest that major linkages exist between Copia and Gypsy.
5. Ka/Ks analysis of the RT domains revealed that most mulberry TEs in the Copia and Gypsy superfamilies were under strong purifying selection, while some TEs were under positive selection.
6. To determine when LTR retrotransposons accumulated during evolution, we calculated the insertion times of the TEs. The insertion time trends of Copia and Gypsy were consistent: both experienced a burst 1 to 2 million years ago (MYA), and 95% of LTRs were inserted into the mulberry genome within the last 3 million years. In contrast, the insertion times of Lard and Trim were scattered. They also peaked 1 to 2 MYA, and most insertions occurred within the last 3 million years (Lard, 77.1%; Trim, 90.2%). Half-life estimation suggested that the frequency distribution of LTR retrotransposons did not follow a negative exponential distribution, so mulberry LTR retrotransposons might have a different evolutionary history from those of other species. Nested LTR retrotransposons were inserted into the mulberry genome during the last 3 million years, and their insertion times were scattered, which shows that nested LTRs did not form at one specific time. The frequency distributions of nested and full-length LTR retrotransposons across time intervals were strongly positively correlated (r = 0.989, p < 0.01). This means that in each period, a newly inserted LTR retrotransposon was almost as likely to form a nested structure as to insert into a new independent region of the genome.
This also indicated that newly inserted transposons were about equally likely to insert inside or outside an existing transposon, with no preference.
7. The activity of an LTR retrotransposon was unrelated to its copy number in the genome, as supported by the analysis of the copy number and expression of LTR retrotransposon genes.
8. An interesting pattern was found between nested and non-nested LTR retrotransposons. Although 1728 (44.4%) of the 3892 mulberry LTR retrotransposons formed nested structures, these nested LTR sequences accounted for only 10.4% of the genome, compared with 28.4% for full-length LTR retrotransposons. The proportion of the genome occupied by nested LTR retrotransposons was thus much lower than expected (p < 0.01). LTR retrotransposons have had a great impact on genome evolution and genome size expansion. Taken together with the results above, this suggests that LTR retrotransposons that do not form nested structures have a great influence on genome amplification.
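The 80-80-80 rule used in section I to group TEs into families is commonly stated as follows: two elements belong to the same family if they share at least 80% sequence identity over at least 80% of the compared region, with an aligned length of at least 80 bp. The abstract does not say how the rule was implemented, so the following is only a minimal Python sketch under assumptions: the PairwiseHit record, its fields, the example values, and the single-linkage grouping are hypothetical and are not taken from the dissertation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairwiseHit:
    """Hypothetical summary of one pairwise alignment between two TEs."""
    query: str
    subject: str
    identity_pct: float  # percent identity over the aligned region
    aligned_len: int     # aligned length in bp
    shorter_len: int     # length in bp of the shorter compared region


def same_family(hit, min_identity=80.0, min_coverage=80.0, min_len=80):
    """Return True if the hit meets all three 80-80-80 thresholds."""
    coverage_pct = 100.0 * hit.aligned_len / hit.shorter_len
    return (hit.identity_pct >= min_identity
            and coverage_pct >= min_coverage
            and hit.aligned_len >= min_len)


def cluster_families(element_ids, hits):
    """Group elements linked by qualifying hits (single linkage, assumed)."""
    parent = {e: e for e in element_ids}

    def find(x):
        # Follow parent links to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for hit in hits:
        if same_family(hit):
            parent[find(hit.query)] = find(hit.subject)

    families = {}
    for e in element_ids:
        families.setdefault(find(e), []).append(e)
    return sorted(sorted(members) for members in families.values())


# Made-up example data, for illustration only.
elements = ["te1", "te2", "te3", "te4"]
hits = [
    PairwiseHit("te1", "te2", 92.0, 900, 1000),  # meets all three thresholds
    PairwiseHit("te2", "te3", 85.0, 500, 1000),  # coverage is 50%: fails
    PairwiseHit("te3", "te4", 81.0, 70, 80),     # aligned length < 80 bp: fails
]
print(cluster_families(elements, hits))
```

Single linkage is a simplification: one qualifying hit is enough to merge two groups, so a chain of pairwise matches can join elements that do not match each other directly. A stricter grouping would change the family counts.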
Keywords/Search Tags: mulberry, transposable elements, characterization analysis, transposable elements database