The Genome Annotation Of Xenopus Tropicalis Based On The 2nd And 3rd Generation Sequencing

Posted on: 2018-05-29
Degree: Master
Type: Thesis
Country: China
Candidate: X R Hu
GTID: 2370330515453671
Subject: Systems Engineering
Abstract/Summary:
Sequencing technologies are advancing rapidly. Next-generation (second-generation) high-throughput sequencing has become mainstream and is applied in many fields, while third-generation sequencing, which is based on single molecules, is steadily maturing. As more sequencing projects are carried out, they produce large volumes of sequence data with differing characteristics: next-generation sequencing is more accurate but yields shorter reads, whereas third-generation sequencing yields longer reads with a higher error rate. Fully mining and systematically integrating the biological information from these two kinds of sequencing data to obtain a complete genome annotation is of great significance for transcriptomics. Owing to advantages such as a short growth cycle, rapid embryonic development, and a diploid genome, Xenopus tropicalis has become an important model species in genomics, genetics, and embryology. This thesis takes Xenopus tropicalis as its study object and, based on next-generation and third-generation sequencing data, proposes a complete pipeline for annotating its genome. The pipeline has two parts: transcriptome assembly, and genome annotation based on the assembled transcriptome. The transcriptome assembly part comprises: using the next-generation sequencing reads for de novo and reference-guided transcript assembly; correcting the third-generation sequencing reads with the next-generation sequencing reads to construct full-length transcripts; and assembling the de novo assembled transcripts with PASA to obtain the transcriptome. The genome annotation part comprises: training a hidden Markov model on ORFs extracted from subsets of the assembled transcriptome in order to predict genes; and integrating different data, such as de novo assembled transcripts, expressed sequence tags, homologous transcripts, homologous protein sequences, and gene prediction results, to annotate the Xenopus tropicalis genome. The results show that the annotation pipeline can effectively integrate biological data from different sources to obtain the assembled transcriptome, and can draw reliable evidence from these data to annotate the amphibian genome accurately. Compared with other versions of the annotation, the annotation produced by this pipeline finds more genes and coding regions and extends the lengths of previously annotated genes. Overall, it lays a good foundation for future research on functional genome annotation, comparative genome analysis, re-sequencing, and related work.
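The ORF extraction step that feeds the hidden Markov model training can be illustrated with a minimal Python sketch. The abstract does not say which tool or criteria the thesis used to extract ORFs, so everything here is an assumption made for illustration: the function name `longest_orf` is hypothetical, only the three forward reading frames are scanned, and an ORF is defined simply as the span from an ATG to the first in-frame stop codon.

```python
# Illustrative sketch only: not the thesis's actual ORF extraction method.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return the longest ATG-to-stop ORF in the three forward frames.

    Returns an empty string if no complete ORF is found.
    """
    seq = seq.upper()
    best = ""
    for frame in range(3):
        start = None  # index of the current open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orf = seq[start:i + 3]  # include the stop codon
                if len(orf) > len(best):
                    best = orf
                start = None
    return best


if __name__ == "__main__":
    # Toy transcript; the ORF lies in the third forward frame.
    print(longest_orf("CCATGAAATTTTAGGG"))
```

A real pipeline would also scan the reverse complement and apply a minimum length threshold before using the ORFs as training data; both are omitted here to keep the sketch short.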
Keywords/Search Tags: Sequencing Data, Transcriptome, Genome Annotation