
Large-scale data analysis of gene expression maps obtained by voxelation

Posted on: 2013-10-11
Degree: Ph.D.
Type: Thesis
University: Temple University
Candidate: An, Li
Full Text: PDF
GTID: 2450390008467945
Subject: Biology
Abstract/Summary:
Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological diseases, and gene expression profiles have been widely used in functional genomic studies. However, little work in traditional gene expression profiling takes into account where in the brain a gene is expressed. Gene expression maps, which are obtained by combining voxelation and microarrays, contain spatial information about the expression of genes in the mouse brain. We study approaches for identifying the relationship between gene expression maps and gene functions, for mining association rules, and for predicting certain gene functions and functional similarities based on the gene expression maps obtained by voxelation.

First, we identified the relationship between gene functions and gene expression maps. On the one hand, we chose typical genes as queries and sought groups of genes whose expression maps are similar to those of the queries. We then studied the relationship between functions and maps by checking the similarity of gene functions within the detected groups. The similarity between a pair of gene expression maps was measured by the Euclidean distance between their feature vectors, which were extracted by wavelet transformation from the hemisphere-averaged gene expression maps. Similarities of gene functions were measured by Lin's method based on gene ontology structures. On the other hand, we proposed a multiple clustering approach, combined with a hierarchical clustering method, to detect significant clusters of genes that have both similar gene functions and similar gene expression maps. Within each group of similar genes, the gene function similarity was measured by calculating the average pair-wise gene function distance in the group and then ranking it against random cases.
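The map-similarity query described above can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the abstract: a one-level Haar transform stands in for the unspecified wavelet, hemisphere averaging is modeled as averaging each row with its left-right mirror image, and the tiny 2x4 grids and gene names are hypothetical toy data rather than real voxelation maps.

```python
import math

def average_hemispheres(grid):
    """Average each row with its left-right mirror (assumed form of hemisphere averaging)."""
    return [[(row[j] + row[-1 - j]) / 2 for j in range(len(row))] for row in grid]

def haar_approximation(grid):
    """One level of 2-D Haar approximation coefficients: each 2x2 block sum divided by 2."""
    rows, cols = len(grid), len(grid[0])
    return [
        (grid[i][j] + grid[i][j + 1] + grid[i + 1][j] + grid[i + 1][j + 1]) / 2
        for i in range(0, rows - 1, 2)
        for j in range(0, cols - 1, 2)
    ]

def features(grid):
    """Feature vector of a map: Haar approximation of the hemisphere-averaged grid."""
    return haar_approximation(average_hemispheres(grid))

def rank_by_similarity(query, maps):
    """Return the other genes ordered by Euclidean distance to the query's features."""
    q = features(maps[query])
    dists = {g: math.dist(q, features(m)) for g, m in maps.items() if g != query}
    return sorted(dists, key=dists.get)

# Hypothetical toy maps (rows x columns of voxel expression values).
toy_maps = {
    "gene_a": [[1, 2, 2, 1], [1, 2, 2, 1]],
    "gene_b": [[1, 2, 2, 1], [1, 2, 3, 1]],
    "gene_c": [[9, 0, 0, 9], [9, 0, 0, 9]],
}

ranking = rank_by_similarity("gene_a", toy_maps)
print(ranking)
```

In this toy setting the genes nearest to the query form the candidate group whose functional similarity would then be checked, for example with Lin's measure.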
By finding groups of genes similar to typical genes, we were able to improve our understanding of gene expression patterns and gene functions. With multiple clustering, we obtained significant clusters of genes with similar expression maps and very similar gene functions with respect to each of the corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. The experimental results confirm the hypothesis that genes with similar gene expression maps have similar gene functions for certain genes.

Based on the relationship between gene functions and expression maps, we developed a modified Apriori algorithm to mine association rules among gene functions in the significant clusters. The experimental results show that the detected association rules (frequent itemsets of gene functions) are biologically meaningful. By inspecting the obtained clusters and the genes sharing the same frequent itemsets of functions, we discovered interesting clues that provide valuable insight to biological scientists. The discovered association rules can potentially be used to predict gene functions based on the similarity of gene expression maps.

Moreover, we proposed an efficient approach to identify gene functions. A gene function or a set of certain gene functions can potentially be associated with a specific gene expression profile. We call this profile a Functional Expression Profile (FEP) for one function, or a Multiple Functional Expression Profile (MFEP) for a set of functions. We suggested two different ways of finding (M)FEPs: a cluster-based and a non-cluster-based method.
Both of these methods achieved high accuracy in predicting gene functions, each for different kinds of gene functions. Compared to the traditional K-nearest neighbor method, our approach shows higher accuracy in predicting functions. The visualized gene expression maps of (M)FEPs were in good agreement with anatomical components of the mouse brain.

Furthermore, we proposed a supervised learning methodology to predict pair-wise gene functional similarity from gene expression maps. Using a modified AdaBoost algorithm coupled with our proposed weak classifier, we predicted the functional similarities between genes to a certain degree. The experimental results showed that as the similarity of gene expression maps increased, the functional similarity increased as well. The weights of the features in the model indicated the most significant single voxels and pairs of neighboring voxels, which can be visualized in the expression map image of a mouse brain.
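The boosting step and the reading of feature weights can be sketched as follows. This is plain AdaBoost with single-feature threshold stumps, not the modified algorithm or the weak classifier proposed in the thesis, whose details the abstract does not give. The encoding is also an assumption: each gene pair is represented by hypothetical per-voxel absolute differences, with label +1 for a functionally similar pair and -1 otherwise.

```python
import math

def stump_predict(x, f, thr, pol):
    """Threshold stump: vote pol when feature f is at most thr, otherwise -pol."""
    return pol if x[f] <= thr else -pol

def train_stump(X, y, w):
    """Exhaustively pick the stump (error, feature, threshold, polarity) with lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({x[f] for x in X}):
            for pol in (1, -1):
                err = sum(wi for wi, x, yi in zip(w, X, y)
                          if stump_predict(x, f, thr, pol) != yi)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def adaboost(X, y, rounds):
    """Standard AdaBoost: reweight samples each round and collect weighted stumps."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, f, thr, pol = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # clamp to avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, f, thr, pol))
        w = [wi * math.exp(-alpha * yi * stump_predict(x, f, thr, pol))
             for wi, x, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(x, f, thr, pol) for alpha, f, thr, pol in model)
    return 1 if score >= 0 else -1

def feature_weights(model, n_features):
    """Total alpha assigned to each feature, a rough indicator of its importance."""
    totals = [0.0] * n_features
    for alpha, f, _, _ in model:
        totals[f] += alpha
    return totals

# Hypothetical toy data: two voxel-difference features per gene pair.
X = [[0.1, 0.9], [0.2, 0.1], [0.8, 0.3], [0.9, 0.7]]
y = [1, 1, -1, -1]

model = adaboost(X, y, rounds=3)
weights = feature_weights(model, n_features=2)
print([predict(model, x) for x in X], weights)
```

In this toy example only the first feature separates the two classes, so it receives all of the weight. With real maps, features with large accumulated weight would correspond to the voxels worth visualizing.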
Keywords/Search Tags:Expression, Gene, Brain, Obtained, Association rules, Clusters