Study On Reconstruction Transcript And Alternative Splicing In Silkworm

Posted on: 2019-08-25
Degree: Master
Type: Thesis
Country: China
Candidate: J F Fu
GTID: 2370330566980312
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Alternative splicing is widespread in organisms. It is a mechanism by which genes generate protein diversity, and it plays an important role in a series of biological processes such as immune response, cell differentiation, sex determination, neural development, and biological evolution. In the nucleus, genes are first transcribed into precursor mRNAs, which are then spliced to produce mature mRNAs. In this process, multiple transcripts are often generated by different splicing patterns. With the development of high-throughput sequencing technologies and bioinformatics, the identification of genome-wide alternative splicing events in many model organisms has revealed the temporal and spatial complexity of gene transcription.

As a representative Lepidoptera model insect, Bombyx mori is an important model for Lepidoptera genomics research. In recent years, transcriptome sequencing of the silkworm has produced data for a large number of tissue and developmental-stage samples. However, data mining, annotation, and expression pattern analysis of silkworm transcripts at the whole-genome level are still lacking. Therefore, this study collected silkworm transcriptome data and, using bioinformatics methods, reconstructed the genome-wide transcripts of the silkworm; classified and annotated the reconstructed transcripts; predicted new genes and new lncRNAs; identified alternative splicing events at the genome-wide level and counted their types and distribution; and analyzed the tissue-specific and stage-specific expression of alternative splicing events and visualized their expression patterns. The main research results are as follows:

1. Reconstruction of silkworm transcripts

To reconstruct the transcripts of the silkworm, we collected silkworm RNA-seq data published in recent years from public databases. After filtering and screening, 58 samples were obtained, derived from 12 tissues and organs, including fat body, brain, posterior silk gland, ovary, and testis. Each sample was then mapped to the silkworm genome using HISAT2. Next, with the annotated gene set as a reference, StringTie was used to assemble transcripts for the qualified samples, giving reconstructed transcripts for each tissue sample; the number of transcripts per sample ranged from 19,000 to 40,000. Finally, all transcripts were merged using StringTie merge, giving a combined set of 64,372 transcripts, of which 26,450 were from predicted genes. There was an average of 2 transcripts per gene, which provided the basis for studying alternative splicing. In addition, the reconstructed transcripts were compared with the annotated genes for classification, and transcript length and exon number were compared between them. In the reconstructed transcripts, the multi-exon count was 171,573 and the single-exon count was 10,505. In the annotated transcripts, the multi-exon count was 79,442 and the single-exon count was 3,387. This indicates that the reconstructed transcripts include new exons or new genes, providing a data basis for genome-wide alternative splicing analysis.

2. Annotation and analysis of reconstructed transcripts

The reconstructed transcripts were annotated to analyze their composition, such as new gene transcripts and non-coding RNA transcripts. First, comparing the reconstructed transcripts with the annotated genes gave 15,721 known transcripts, 37,831 transcripts from annotated genes, and 10,822 transcripts of unknown genes. Second, new genes that may be present among the unknown transcripts were predicted, giving 189 new genes comprising 246 transcripts. Then, new lncRNAs among the unknown transcripts were predicted and analyzed, and 494 new lncRNAs were obtained. Finally, WGCNA co-expression analysis was performed on the FPKM values of the coding genes and new lncRNAs expressed in 16 samples, and 18 lncRNAs were found to be co-expressed with their associated coding genes. The prediction of new genes and new lncRNAs will help improve the study of silkworm transcripts and non-coding RNAs.

3. Identification of genome-wide alternative splicing events in silkworm

To identify alternative splicing events at the genome level, gtfToGenePred was first used to convert the file format of the reconstructed transcripts and obtain exon splice site information. rnaseqlib was then used to identify alternative splicing events. A total of 18,468 alternative splicing events of 5 types were obtained at the genome-wide level: 5,399 exon skipping events, 4,622 alternative 3' splice site events, 4,110 alternative 5' splice site events, 1,237 mutually exclusive exon events, and 3,100 retained intron events. Finally, the alternative splicing events were analyzed on each chromosome of the silkworm; the number of events per chromosome ranged from 321 to 1,193. The splicing events on chromosome 12 are mainly mutually exclusive exon events, and the BMgn010363 gene on this chromosome generates a large number of them. The BMgn010363 gene has 34 conserved exons and 58 alternatively spliced exons, which constitute 4 variable regions and 3 independently spliced exons, and it produces 46 transcript isoforms. The gene is homologous to the insect DSCAM gene, and both produce a large number of isoforms through alternative splicing. The identification of alternative splicing at the genome-wide level in the silkworm will provide data support for studying gene isoforms and their functions.

4. Tissue-specific and stage-specific alternative splicing event analysis and visualization of expression patterns

Building on the genome-wide analysis of alternative splicing events, the expression patterns of alternatively spliced genes were analyzed. A comparison of expression levels showed that alternatively spliced genes were expressed at significantly higher levels than genes without alternative splicing. MISO was used to calculate the temporal and spatial expression patterns in 4 embryonic samples and 6 larval tissue samples. The PSI values of alternative splicing events were obtained and cluster analysis was performed. The Jensen-Shannon score algorithm was used to calculate the significance of specific splicing events for each sample. The number of specific splicing events per sample in the egg stage was between 437 and 637, with the 24 h sample having the most; for example, the BMgn011842 gene shows specific splicing at 0 h. The number of specific splicing events in the various tissues and organs ranged from 0 to 819, with the testis having the most; for example, the BMgn003856 gene was specifically spliced in the testis. These results indicate that alternative splicing events show spatio-temporal specificity across different tissues and developmental stages of the silkworm, and that the specifically spliced genes will produce different proteins. This provides a reference for studying the biological function of genes in different tissues and developmental stages.
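The mapping, assembly, and merge steps in result 1 could be scripted roughly as follows. This is only a sketch: the sample name, file paths, paired-end read layout, and the samtools sorting step are assumptions for illustration, since the abstract does not state the exact parameters used. The functions build the command lines without running them.

```python
def sample_commands(sample, index, annotation, reads_1, reads_2, outdir="out"):
    """Build (not run) the per-sample alignment and assembly commands.

    All arguments are hypothetical placeholders, not values from the thesis.
    """
    sam = f"{outdir}/{sample}.sam"
    bam = f"{outdir}/{sample}.sorted.bam"
    gtf = f"{outdir}/{sample}.gtf"
    return [
        # --dta makes HISAT2 report alignments tailored for transcript assemblers
        ["hisat2", "--dta", "-x", index, "-1", reads_1, "-2", reads_2, "-S", sam],
        # StringTie expects coordinate-sorted alignments
        ["samtools", "sort", "-o", bam, sam],
        # -G supplies the annotated gene set as a guide for assembly
        ["stringtie", bam, "-G", annotation, "-o", gtf],
    ]


def merge_command(gtf_list, annotation, merged_gtf="merged.gtf"):
    """Build the command that merges per-sample GTFs into one transcript set."""
    return ["stringtie", "--merge", "-G", annotation, "-o", merged_gtf, gtf_list]


if __name__ == "__main__":
    for cmd in sample_commands("fat_body_1", "bmori_index", "bmori_genes.gtf",
                               "fat_body_1_R1.fq.gz", "fat_body_1_R2.fq.gz"):
        print(" ".join(cmd))
    print(" ".join(merge_command("gtf_list.txt", "bmori_genes.gtf")))
```

Here `gtf_list.txt` stands for a text file listing one per-sample GTF path per line, which is the input form the StringTie merge mode accepts.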
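The event counts reported in result 3 can be tallied to check that the five types add up to the stated total and to see each type's share. The counts are taken from the abstract; the variable names are illustrative.

```python
# Event counts by splice type, as reported in the abstract
events = {
    "exon skipping": 5399,
    "alternative 3' splice site": 4622,
    "alternative 5' splice site": 4110,
    "mutually exclusive exons": 1237,
    "retained intron": 3100,
}

total = sum(events.values())

# Share of each event type as a percentage of all events
shares = {name: 100 * count / total for name, count in events.items()}

if __name__ == "__main__":
    print(f"total events: {total}")
    for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:28s} {events[name]:6d}  {share:5.1f}%")
```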
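The abstract names a Jensen-Shannon score for sample-specific splicing but does not give its exact form. One common formulation, assumed here, scores sample i as 1 minus the square root of the Jensen-Shannon divergence between an event's normalized profile across samples and an idealized profile present only in sample i. The PSI values in the usage example are made up for illustration.

```python
import math


def entropy(p):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, so the value lies in [0, 1])."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return entropy(m) - (entropy(p) + entropy(q)) / 2


def specificity_scores(values):
    """Score each sample by how close the profile is to 'only in this sample'.

    Assumed formulation: 1 - sqrt(JSD(profile, unit vector for sample i)).
    """
    total = sum(values)
    if total == 0:
        return [0.0] * len(values)
    profile = [v / total for v in values]
    scores = []
    for i in range(len(profile)):
        ideal = [1.0 if j == i else 0.0 for j in range(len(profile))]
        jsd = max(0.0, js_divergence(profile, ideal))  # guard against rounding
        scores.append(1 - math.sqrt(jsd))
    return scores


if __name__ == "__main__":
    # Hypothetical PSI values for one event across four samples
    psi = [0.05, 0.02, 0.90, 0.03]
    for i, score in enumerate(specificity_scores(psi)):
        print(f"sample {i}: specificity {score:.3f}")
```

A score near 1 means the event is concentrated in that sample, and a uniform profile gives the same moderate score to every sample. Calling an event "specific" would need a cutoff on this score, which the abstract does not report.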
Keywords: silkworm, reconstructed transcript, alternative splicing, DSCAM, lncRNA