
Analyses Of Genome Duplication And Relationships And Alternative Splicing Using High Throughput Sequencing Technology

Posted on: 2015-10-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H F Wang
GTID: 1220330464955397
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Acquiring the genetic information of organisms quickly and accurately is of the utmost importance in life science research. The carrier of organismal genetic information is the genome (including the nuclear and organellar genomes). Sequencing technologies can accurately determine the genetic information in genomic DNA. Genome sequences can comprehensively reveal the diversity and complexity of the genome and provide clues to the regulation of biological activity. Therefore, sequencing technologies play an important role in life science research.

Next Generation Sequencing (NGS) technologies have been widely used in studies of the genome, transcriptome, methylome, metagenome and phylogenome. In this study, I used NGS data to investigate Whole Genome Duplication events (WGDs) among angiosperms, seeking evidence for WGDs in the history of dozens of angiosperms in several phylogenetic groups, using both public data from sequenced genomes and transcriptome data obtained in our lab. The results showed that WGDs are very common among angiosperms, in representatives of both relatively large and relatively small groups, indicating that WGDs alone are not sufficient to account for species diversity. I further analyzed the functional groups of retained duplicated genes and found that genes encoding transcription factors, proteins with kinase activity, and proteins with other functions are significantly enriched among retained duplicated genes. I also found that genes in some functional categories associated with development, environmental response and others are frequently retained after duplication. These findings support the idea that genes from WGDs have contributed to species adaptation and evolution.

In addition to WGDs, alternative splicing (AS) also contributes to the diversification of the proteome. Compared with animals, studies of AS in plants are still at an early stage. AS studies in the model species Arabidopsis thaliana can have important reference value for other plants. Furthermore, flower development is one of the most complicated biological processes and is essential for plant sexual reproduction. Therefore, I used the NGS technology RNA-Seq to study AS at different stages of flower development in Arabidopsis. By comparing AS across three flower developmental stages, I found hundreds of developmentally regulated AS events and many Novel Transcribed Regions (NTRs). These NTRs can complement the existing Arabidopsis annotations. This study provides a valuable resource and reference for further AS studies in plants.

NGS technology has contributed to both WGD and AS studies, and it also provides an unprecedented opportunity to study metagenomics. By directly sequencing the microorganisms in environmental samples, scientists can obtain the genetic information of many microorganisms that are difficult or impossible to cultivate in the laboratory. This helps scientists to characterize the interactions among microorganisms. However, with the increasing amount of sequencing data, there is a growing need to analyse environmental sequencing data rapidly and efficiently. To serve this need, another graduate student and I collaborated to develop an algorithm called MetaCV, which can process environmental data quickly and accurately. MetaCV processes sequencing reads directly, without the need for assembly, and readily provides taxonomic and functional estimates. MetaCV adopts component vectors as similarity measurements, which retain the taxonomic information of the species in the samples, resulting in accurate and reliable classification. Consequently, MetaCV performed better than other algorithms on both simulated and real data.

In summary, using data from NGS technology, I have studied WGDs among angiosperms and AS in Arabidopsis thaliana, and co-developed an algorithm for metagenomics. These studies not only extend the applications of NGS but also provide important references and a theoretical basis for other related research.
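
To illustrate the general idea of using component vectors as similarity measurements, the following is a minimal Python sketch and not MetaCV's actual implementation. It assumes that a component vector is simply a table of k-mer counts and that similarity is the cosine between two such tables; the function names, the choice of k = 3, and the toy reference sequences are all hypothetical and introduced only for illustration.

```python
from collections import Counter
from math import sqrt


def composition_vector(seq, k=3):
    """Count every overlapping k-mer in seq (assumed vector definition)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(u, v):
    """Cosine of the angle between two k-mer count vectors."""
    # Counter returns 0 for k-mers that are absent from v.
    dot = sum(count * v[kmer] for kmer, count in u.items())
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a sequence shorter than k has no k-mers
    return dot / (norm_u * norm_v)


def classify_read(read, references, k=3):
    """Return the name of the reference whose vector is most similar to the read."""
    read_vec = composition_vector(read, k)
    return max(
        references,
        key=lambda name: cosine_similarity(
            read_vec, composition_vector(references[name], k)
        ),
    )


# Toy, made-up reference sequences (hypothetical data, not real genomes).
references = {
    "taxon_A": "ATATATATATATATAT",
    "taxon_B": "GCGCGCGCGCGCGCGC",
}
best_match = classify_read("ATATATAT", references)
```

In this sketch each read is compared with the references directly, without assembly, which mirrors the assembly-free workflow described above. A practical tool would also need a much larger reference database and a score threshold for reads that match nothing well.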
Keywords/Search Tags: Next Generation Sequencing technology, transcriptome, metagenome, phylogenome, angiosperm, Whole Genome Duplications, flower development, Arabidopsis thaliana, alternative splicing