
Integrating gene expression and metabolic profiles to optimize cellular functions

Posted on: 2007-07-16
Degree: Ph.D.
Type: Dissertation
University: Michigan State University
Candidate: Li, Zheng
Full Text: PDF
GTID: 1444390005973062
Subject: Biology
Abstract/Summary:
With advances in high-throughput technology, profiles of gene expression, proteins, and metabolites can be acquired to help elucidate the network of pathways involved in producing a specific phenotype. This dissertation presents a systems approach developed to integrate gene expression, metabolic, and phenotypic profiles in order to identify the active pathways that confer a phenotype. The approach comprises separate components to (i) identify genes that are relevant to a cellular or metabolic process, (ii) integrate multi-source information, and (iii) reconstruct pathways and networks. Approaches to identify genes relevant to a phenotype were developed using genetic algorithm-coupled partial least squares analysis (GA/PLS) and are discussed in Chapter 2. GA/PLS used a log-linear model to identify subsets of genes that best predict a phenotype, e.g., a metabolic function. Next, we applied Bayesian network analysis to infer network structures from metabolic data, as discussed in Chapter 3. Metabolic data were chosen initially because, if Bayesian network analysis can infer well-known metabolic pathways (e.g., the TCA cycle and the urea cycle) from experimental data, this provides confidence in the ability of the methodology to infer other networks, such as genetic regulatory networks from gene data. In Chapter 4, we integrated the ideas from the previous two chapters into a Three-stage Integrative Pathway Search (TIPS(c)), which combined methods to identify relevant genes with network reconstruction. Unlike other approaches, this approach identified the active pathways without requiring interaction measurements or libraries of genetic mutants, and with a limited amount of data. The reconstructed network was validated by comparing in silico perturbation studies with published results and further experiments. The framework provided very good predictions of the effect of some, but not all, of the perturbation studies.
This may be due in part to the inability of Bayesian network analysis to handle transients, such as cycles and feedback loops. Uncovering the additional information from these transients would be valuable in elucidating the mechanism involved in producing a particular phenotype. Therefore, in Chapter 5, we developed a dynamic model using time-series gene expression data. To illustrate the ability of the model, we applied it to Escherichia coli K12 (E. coli) and Saccharomyces cerevisiae (yeast), which were more readily amenable to validation. Finally, Chapter 6 discusses improvements to the current TIPS(c) approach, namely including multiple metabolites, incorporating multi-source information, and adding dynamic modeling for time-series data.
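The GA/PLS gene-selection idea summarized in the abstract can be illustrated with a small sketch. This is not the dissertation's implementation. It assumes a one-component PLS1 regression on log-transformed data, an in-sample R² with a per-gene size penalty as the fitness (a real analysis would more likely use cross-validated prediction error), and purely synthetic expression data. All names (`pls1_fitted`, `fitness`, `ga_select`) and parameter values are hypothetical.

```python
import math
import random


def pls1_fitted(X, y):
    """Fitted values from a one-component PLS1 regression of y on X."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximal covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:
        return [y_mean] * n
    w = [v / norm for v in w]
    # Latent scores, then regress y on the single score vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    b = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return [y_mean + b * ti for ti in t]


def fitness(mask, X, y, penalty=0.01):
    """In-sample R^2 of the PLS fit on the selected genes, minus a size penalty."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return float("-inf")  # an empty gene subset is never acceptable
    fitted = pls1_fitted([[row[j] for j in cols] for row in X], y)
    y_mean = sum(y) / len(y)
    ss_res = sum((a - f) ** 2 for a, f in zip(y, fitted))
    ss_tot = sum((a - y_mean) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot - penalty * len(cols)


def ga_select(X, y, pop_size=30, generations=25, p_mut=0.05, rng=None):
    """Search gene subsets (bit masks) with a simple elitist genetic algorithm."""
    rng = rng or random.Random(0)
    p = len(X[0])

    def score(mask):
        return fitness(mask, X, y)

    # Seed the population with the all-genes mask plus random masks.
    pop = [[1] * p] + [[rng.randint(0, 1) for _ in range(p)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        nxt = [max(pop, key=score)[:]]  # elitism: carry the best mask forward
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament selection
            b = max(rng.sample(pop, 3), key=score)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [1 - g if rng.random() < p_mut else g for g in child]  # mutation
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    return best, score(best)


# Synthetic data (assumption): the phenotype depends on genes 1 and 4 only.
rng = random.Random(42)
n_samples, n_genes = 30, 8
expr = [[rng.lognormvariate(0, 0.5) for _ in range(n_genes)]
        for _ in range(n_samples)]
phenotype = [row[1] ** 1.5 * row[4] ** -1.0 * rng.lognormvariate(0, 0.1)
             for row in expr]

# Log-linear model: regress log(phenotype) on log(expression).
log_X = [[math.log(v) for v in row] for row in expr]
log_y = [math.log(v) for v in phenotype]

best_mask, best_score = ga_select(log_X, log_y, rng=rng)
print("selected genes:", [j for j, bit in enumerate(best_mask) if bit])
print("penalized fitness:", round(best_score, 3))
```

Because the initial population contains the all-genes mask and the best mask is always carried forward, the returned subset never scores worse than using every gene. The size penalty is what pushes the search toward smaller subsets.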
Keywords/Search Tags:Gene expression, Metabolic, Profiles, Data, Model, Approach