Clusters of coexpressed genes in the human and mouse genomes, mutational analysis of SNPs, and DNA microarray methods
Posted on: 2007-03-02 | Degree: Ph.D. | Type: Dissertation
University: George Mason University | Candidate: Moon, Wonjong
GTID: 1444390005463251 | Subject: Biology

Abstract/Summary:

This dissertation is concerned with broad issues of the structure and function of mammalian genomes, particularly regional differences within genomes and gene expression analysis. Previous work has shown that coexpressed genes occur in clusters on the chromosomes of yeast, worms, fruit flies, and humans. Here, human and mouse gene expression data from Affymetrix and cDNA microarray platforms were mapped to their chromosomal positions. Correlation coefficients between the expression patterns of adjacent genes were computed across the entire human and mouse genomes, with different window sizes and P value cutoffs. Clusters of coexpressed genes occurred significantly more often in the biological gene order than in randomly shuffled gene order, for all examined microarray platforms and species. Moreover, most of the clusters of coexpressed genes contained 2-4 genes regardless of microarray platform or species. Approximately 12-18% of mammalian genes belonged to clusters of coexpressed genes. Housekeeping genes did not appear to be over-represented in clusters of coexpressed genes under any of several definitions of housekeeping genes. The majority of clusters of coexpressed genes were not due to tandem duplications (gene families). However, gene families were over-represented among clusters of coexpressed genes.

In addition to regional differences in gene expression, mammalian genomes also contain significant regional differences in mutation rates. For example, mutations produced by 5-methylcytosine (5mC) deamination are highly dependent on regional GC content.
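The comparison of biological versus shuffled gene order described for coexpressed clusters can be illustrated with a minimal sketch. It is not the dissertation's actual procedure: it assumes a window of two genes (adjacent pairs only), a Pearson correlation cutoff in place of a P value cutoff, and hypothetical function names (`adjacent_correlations`, `count_coexpressed_pairs`, `shuffled_counts`, `empirical_p_value`). The input is assumed to be a genes-by-samples expression matrix whose rows are already sorted by chromosomal position.

```python
import numpy as np


def adjacent_correlations(expr):
    """Pearson r between each gene and its next neighbor.

    expr: 2-D array, rows = genes in chromosomal order, columns = samples.
    Returns an array of length n_genes - 1.
    """
    expr = np.asarray(expr, dtype=float)
    # Center each gene's profile and scale it to unit length, so the dot
    # product of two profiles equals their Pearson correlation.
    centered = expr - expr.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # constant profiles get r = 0 instead of NaN
    unit = centered / norms
    return np.sum(unit[:-1] * unit[1:], axis=1)


def count_coexpressed_pairs(expr, r_cutoff=0.7):
    """Number of adjacent gene pairs with r at or above the cutoff (assumed value)."""
    return int(np.sum(adjacent_correlations(expr) >= r_cutoff))


def shuffled_counts(expr, n_shuffles=200, r_cutoff=0.7, seed=0):
    """Counts of coexpressed adjacent pairs after randomly shuffling gene order."""
    expr = np.asarray(expr, dtype=float)
    rng = np.random.default_rng(seed)
    counts = np.empty(n_shuffles, dtype=int)
    for i in range(n_shuffles):
        order = rng.permutation(len(expr))
        counts[i] = count_coexpressed_pairs(expr[order], r_cutoff)
    return counts


def empirical_p_value(observed, null_counts):
    """Fraction of shuffles with at least as many pairs as observed (add-one corrected)."""
    null_counts = np.asarray(null_counts)
    return (1 + np.sum(null_counts >= observed)) / (1 + len(null_counts))
```

With a real expression matrix `expr`, one would compare `count_coexpressed_pairs(expr)` against `shuffled_counts(expr)`; a small empirical P value indicates more coexpressed neighbors than expected from gene order alone.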
Linear regression analysis showed that the log10 of the 5mC deamination rates (inferred from human single-nucleotide polymorphism (SNP) frequencies) had slopes of -3.0 when graphed with respect to the GC content of neighboring sequences. This is the slope that would be predicted if the correlation between CpG underrepresentation and GC content were caused solely by DNA melting. Moreover, the same result was obtained regardless of the SNP locations (all SNPs versus only SNPs in noncoding intergenic regions, excluding CpG islands) and regardless of the lengths over which GC content was calculated (SNP sequences with a modal length of 564 bp versus genomic contigs with a modal length of 163 kb).

Finally, vacuum systems of printers used in DNA microarray manufacture were assessed. The precision and accuracy of microarray technology dictate the resolution of gene expression measurements in the analysis of genome structure and function. (Abstract shortened by UMI.)...

Keywords/Search Tags: Coexpressed genes, Genomes, Clusters, DNA, Microarray, GC content, Human and mouse, SNPs
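The regression described in the abstract (log10 of the inferred 5mC deamination rate against the GC content of neighboring sequence) can be sketched as an ordinary least-squares fit. This is an illustration only: the function name `fit_log10_rate_vs_gc` is hypothetical, GC content is assumed to be expressed as a fraction, and the data below are synthetic values generated with a known slope, not results from the dissertation.

```python
import numpy as np


def fit_log10_rate_vs_gc(gc_content, rates):
    """Least-squares line through (GC content, log10(rate)).

    Returns (slope, intercept) of log10(rate) = slope * GC + intercept.
    """
    gc = np.asarray(gc_content, dtype=float)
    log_rate = np.log10(np.asarray(rates, dtype=float))
    slope, intercept = np.polyfit(gc, log_rate, 1)
    return slope, intercept


# Synthetic, noise-free example: rates constructed so that log10(rate)
# falls by 3.0 per unit of GC fraction. The intercept of 1.0 is arbitrary.
gc = np.linspace(0.35, 0.60, 26)
rates = 10 ** (-3.0 * gc + 1.0)

slope, intercept = fit_log10_rate_vs_gc(gc, rates)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Fitting on the log scale means the slope measures the fold change in rate per unit of GC content, which is how the abstract reports the relationship.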