
Quality Assessment Of Insect Transcriptome And Pathway Construction Of Insect

Posted on: 2015-11-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Zhang
GTID: 1220330482468799
Subject: Agricultural Entomology and Pest Control
Abstract/Summary:
A transcriptome is the set of mRNA sequence information and is a very useful gene resource. Here, we analyzed the transcriptomes of the yellow stem borer, Scirpophaga incertulas, and the rice leaffolder, Cnaphalocrocis medinalis. The transcriptomes of these two insects were compared with those of the Asian honey bee (Apis cerana cerana), the common green bottle fly (Lucilia sericata), the brown planthopper (Nilaparvata lugens) and the whitefly (Bemisia tabaci). We then proposed two metrics to assess transcriptome quality: the percentage of contigs containing the intact CDS (CCIC) and the median length ratio (LR50). After their robustness was tested, the two metrics were used to assess the six insect transcriptomes. Furthermore, we developed a software tool named iPathCons, which can be used to construct pathways from insect transcriptomes. A database, iPathDB, was built, providing an online server to build, search and download insect pathways.

1. Sequencing transcriptome of S. incertulas and C. medinalis

We sequenced the transcriptomes of two insects, S. incertulas and C. medinalis. A comparison of the assembly results from Trinity and SOAPdenovo suggested that Trinity performed better, so we used the Trinity assemblies for further analysis. In total, 64,069 contigs of S. incertulas and 73,284 contigs of C. medinalis were obtained. The annotation results showed that only 30% and 50% of the contigs, respectively, were annotated as protein-coding genes. The contigs of the two insect transcriptomes were grouped into 25 categories by COG analysis.

2. Comparative transcriptome analysis of six insects

We downloaded the raw reads of the other four insects from the NCBI SRA database and assembled their transcriptomes using Trinity. The four transcriptomes were annotated against the nr database, and the annotation results were then compared with those of S. incertulas and C. medinalis.
Among the six insects, the percentage of contigs annotated by the nr database was highest in L. sericata, followed by N. lugens, A. c. cerana, C. medinalis and S. incertulas, and lowest in B. tabaci. We also calculated the length ratio of each contig to its ortholog. Contigs whose length ratio is between 0.9 and 1.1 are considered CCIC. In total, 2,258, 2,058, 3,626, 5,053, 2,094 and 2,173 CCIC were found in S. incertulas, C. medinalis, A. c. cerana, L. sericata, N. lugens and B. tabaci, respectively. GO analysis was carried out with Blast2GO. "Developmental process" in the "Biological Process" category had the highest percentage in L. sericata (44%), which may be related to the short life cycle of the fly. The number of P450 gene sequences in S. incertulas, C. medinalis, N. lugens and B. tabaci was higher than in the other two insects, which may be caused by the environmental stress and pesticide selection that these agricultural pests experience. More OBP genes were found in the fly than in the other insects. The Sid-2 gene was found in five insects.

3. Assessing the quality of transcriptome

Several metrics have been proposed to assess transcriptome assemblies. These metrics mainly consider the depth or coverage of the raw data, the assembly length and the number of contigs. However, they have some disadvantages. Here, we proposed two metrics for the quality assessment of transcriptomes: the percentage of CCIC and LR50. The transcriptome contigs were classified into four categories: contigs containing the intact CDS, fragments with the 5' UTR, fragments with the 3' UTR, and CDS fragments. The primary purpose of transcriptome sequencing is to obtain a list of gene transcripts that are as intact as possible. The transcriptomes of A. c. cerana, L. sericata, N. lugens and B. tabaci were used as the test data. The assembly results of five different software packages were evaluated, indicating that the percentage of CCIC and LR50 are robust.
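The two metrics can be illustrated with a minimal Python sketch. It is not the dissertation's implementation and rests on several assumptions: the length ratio is taken as contig length divided by ortholog length, the 0.9 to 1.1 interval is treated as inclusive, the CCIC percentage is computed over contigs that have an ortholog, and LR50 is the plain median of the length ratios. The function name `assess_transcriptome` and the input layout are hypothetical.

```python
from statistics import median


def assess_transcriptome(pairs, low=0.9, high=1.1):
    """Compute CCIC count, CCIC percentage and LR50 for one assembly.

    pairs: list of (contig_length, ortholog_length) tuples, one per contig
           that has an ortholog (assumed input layout).
    low, high: bounds of the length-ratio interval that marks a contig as
               containing the intact CDS (assumed to be inclusive).
    """
    if not pairs:
        raise ValueError("at least one contig/ortholog pair is required")

    # Length ratio of each contig to its ortholog (assumed direction).
    ratios = [contig_len / ortholog_len for contig_len, ortholog_len in pairs]

    # A contig counts as CCIC when its ratio falls inside [low, high].
    ccic_count = sum(1 for r in ratios if low <= r <= high)

    return {
        "ccic_count": ccic_count,
        "ccic_percent": 100.0 * ccic_count / len(ratios),
        "lr50": median(ratios),  # median length ratio
    }


if __name__ == "__main__":
    # Toy lengths for illustration only, not data from the study.
    toy_pairs = [(950, 1000), (1000, 1000), (500, 1000), (1200, 1000)]
    print(assess_transcriptome(toy_pairs))
```

Under these assumptions, an assembly dominated by fragments yields both a low CCIC percentage and an LR50 well below 1.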
The six insect transcriptomes were then evaluated with the two metrics. The results indicated that the quality of the S. incertulas and C. medinalis transcriptomes is low.

4. Constructing the pathway from insect transcriptome

We downloaded the pathway information of 20 insects from KEGG. The pathways of another fifteen insects with available genomes were then constructed. In total, we obtained pathway information for 35 insects, which was used as the pathway template. A software tool named iPathCons was developed for building pathways from insect transcriptomes. The pathways of S. incertulas, C. medinalis, A. c. cerana, L. sericata, N. lugens and B. tabaci were constructed using iPathCons. In total, 15,328, 16,456, 10,270, 11,871, 11,284 and 8,700 contigs were annotated, respectively. In total, 72% of disease-associated pathways were found in insects, suggesting that insects can be used to model human diseases. The pathway "Starch and sucrose metabolism" in A. c. cerana was found to be complete, showing that genes related to glucose metabolism are highly expressed in the honeybee. We also downloaded the genes related to wing development, and the wing development pathway was built for the 35 insects with published genomes. The results indicated that 16 insects lost Ser, six insects lost Vg, and four insects lost both Ser and Vg.

5. Insect pathways database iPathDB

We constructed an insect pathway database named iPathDB. In total, 52 insects in six insect orders, 12,074 pathways, 98,813 genes and 414,895 sequences were collected in iPathDB. We also built a website for searching, submitting and analyzing insect pathways. An online server for constructing insect pathways was also provided.
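As a rough illustration of the template-based idea in section 4, the sketch below maps contigs onto template pathways and reports which member genes are present or missing. It is not the iPathCons algorithm, which the abstract does not describe. It assumes each contig has already been assigned to a template gene by a homology search, and the function name `build_pathways`, the pathway name, the contig ids and the gene id `gene_x` are all hypothetical.

```python
def build_pathways(templates, contig_hits):
    """Report template pathways that are supported by transcriptome contigs.

    templates:   dict mapping pathway name -> set of template gene ids.
    contig_hits: dict mapping contig id -> template gene id it was
                 assigned to (assumed to come from a prior homology search).
    """
    found_genes = set(contig_hits.values())
    pathways = {}
    for name, genes in templates.items():
        present = genes & found_genes
        if not present:
            continue  # no contig supports this pathway at all
        pathways[name] = {
            "present": sorted(present),
            "missing": sorted(genes - present),
            "complete": present == genes,
        }
    return pathways


if __name__ == "__main__":
    # Toy template and hits for illustration only.
    templates = {"wing development (toy)": {"Ser", "Vg", "gene_x"}}
    contig_hits = {"contig_1": "Vg", "contig_2": "gene_x"}
    print(build_pathways(templates, contig_hits))
```

In this toy case the pathway would be reported with Ser missing, analogous to the gene-loss observations for the wing development pathway, although a gene absent from a transcriptome may simply be unexpressed rather than lost from the genome.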
Keywords/Search Tags: Scirpophaga incertulas, Cnaphalocrocis medinalis, Transcriptome Quality Assessment, Pathway Construction, iPathCons, iPathDB