Bioinformatics Analysis Of Expressed Sequence Tags (EST) In Several Mammals

Posted on: 2007-03-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z X Su
GTID: 1100360215959596
Subject: Biology
Abstract/Summary:
Since the original description of 609 Expressed Sequence Tags (ESTs) by Adams et al. in 1991, large-scale EST sequencing has become widely accepted as a concept and widely used as a technology. To date, a huge volume of EST data has been produced and submitted to public databases such as dbEST and UniGene. ESTs, which arise from partial sequencing of cDNAs, are now widely used throughout the genomics and molecular biology communities for gene discovery, mapping, polymorphism analysis, expression studies, and gene prediction. In this study, starting from EST sequence pre-processing, assembly, clustering, annotation, and functional classification, we focused on several applications of ESTs in vertebrate comparative evolutionary genomics.

Chapter II summarizes a large-scale EST sequencing experiment on porcine mammary gland and a comprehensive bioinformatics analysis of the results. A total of 28,941 ESTs were sequenced from five 5'-directed non-normalized cDNA libraries and assembled into 2212 contigs and 5642 singlets using CAP3. These sequences were annotated and clustered into 6857 unique genes, of which 2072 had no functional annotation and were considered novel genes. The genes were further classified into Gene Ontology categories. By comparing expression profiles, we identified some breed-specific and developmental-stage-specific gene groups. These genes may be related to reproductive performance or play important roles in milk synthesis, secretion, and mammary involution. The unknown EST sequences and the expression profiles across developmental stages and breeds are important resources for further research.

In Chapter III, we predicted alternative splicing (AS) forms from the EST sequences and then extended the study to the evolution of AS after gene duplication.
We observed that duplicate genes have fewer AS forms than single-copy genes, and that the mean number of AS forms is negatively correlated with gene family size. Interestingly, we found that the loss of alternative splicing in duplicate genes may occur shortly after gene duplication. These results support the subfunctionalization model of alternative splicing in the early stage after gene duplication. Further analysis of the distribution of alternative splicing in human duplicate pairs showed that alternative splicing evolves asymmetrically after gene duplication, i.e., the AS forms of the two duplicates may differ dramatically. We therefore conclude that alternative splicing and gene duplication may not evolve independently. In the early stage after gene duplication, young duplicates may take over part of the protein functional diversity that was previously provided by alternative splicing. In the late stage, the gain and loss of alternative splicing appear to be independent between duplicates.

Chapter IV presents a study of gene regulatory evolution using expression profile data based on EST counts in the UniGene database. The evolution of gene regulation is of interest not only in an evolutionary context; it also promises to shed light on the contribution of regulatory region variation to human disease. One major approach to studying the evolution of gene regulation is to start at the phenotypic level and analyze variation in patterns of gene expression. The availability of large transcriptome data sets from various tissues in various organisms, produced with oligonucleotide microarray technology, makes it possible to study the evolution of gene regulation. However, because microarray data have intrinsic limitations, many previous studies in this field reached contradictory conclusions.
In this study, we analyzed the evolution of gene expression comprehensively, using expression profile data extracted from the UniGene database. We verified some previously controversial results and also made some new observations. We developed a new measure of gene expression profile divergence (Ug). Based on Ug, we analyzed expression evolution between human duplicate genes, between human-mouse orthologs, and between orthologous tissues. We found that the evolution of gene expression and the evolution of gene sequence are coupled; both may evolve under natural selection. However, we found no correlation between gene expression evolution and either gene expression specificity or gene expression level. We further constructed tissue expression dendrograms of 15 human and 15 mouse tissues. We suggest that two different factors should be considered when constructing and analyzing tissue expression dendrograms: the evolutionary distance caused by speciation (De) and the evolutionary distance caused by tissue development (Dd).
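
The idea of comparing EST-count expression profiles can be illustrated with a minimal sketch. The abstract does not define Ug, so the sketch below substitutes one minus the Pearson correlation as a stand-in divergence. The function names, the per-million normalization, and the example counts are all assumptions made for illustration and are not taken from the thesis.

```python
from math import sqrt


def relative_abundance(est_counts, library_sizes):
    """Scale raw EST counts per tissue to counts per million ESTs sampled.

    est_counts[i] is the number of ESTs of one gene in tissue i;
    library_sizes[i] is the total number of ESTs sampled from tissue i.
    """
    return [1e6 * c / n for c, n in zip(est_counts, library_sizes)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("profile has zero variance")
    return cov / sqrt(vx * vy)


def profile_divergence(p, q):
    """Stand-in divergence (NOT the thesis's Ug): 1 - Pearson correlation.

    0 means identical profile shape; 2 means perfectly opposite profiles.
    """
    return 1.0 - pearson(p, q)


# Hypothetical EST counts for two duplicate genes across four tissues.
library_sizes = [20000, 15000, 30000, 10000]
gene_a = relative_abundance([12, 3, 0, 5], library_sizes)
gene_b = relative_abundance([10, 0, 9, 1], library_sizes)
print(profile_divergence(gene_a, gene_b))
```

Normalizing by library size before comparing matters because tissues are sampled to very different depths in EST collections, so raw counts are not directly comparable across tissues.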
Keywords/Search Tags:expressed sequence tags, EST, domestic pig, mammary gland, alternative splicing, gene duplication, expression profile, gene expression pattern, gene expression evolution