Application of hierarchical models in microarray data analysis

Posted on: 2006-11-05
Degree: Ph.D.
Type: Dissertation
University: The Johns Hopkins University
Candidate: Liu, Dongmei
Full Text: PDF
GTID: 1454390005995646
Subject: Biology
Abstract/Summary:
To screen differentially expressed genes, we propose four novel models that empirically investigate the impact of two critical choices: the specification of the goals of the selection procedure, and the specification of a dependence structure across genes.

To make inference on functional classes, we propose a hierarchical model with two variations to study the association between disease and functional classes of genes. The method is based on the idea of Bayesian variable selection. Our approach consists of a gene-level model, a class-level model, and a correlation structure among functional classes. The model takes as input a vector of summary statistics that measure gene-specific disease association. The gene-level model is a multiple regression with the summary statistics as the dependent variable and the functional class indicators as covariates. The class-level model assigns priors to the class effects estimated in the gene-level model. A latent variable is incorporated into the prior to identify disease-associated classes. Correlations among functional classes are included in the model through a covariance matrix on the class effects. The model gives a natural interpretation of the association between disease and functional classes. It provides both a qualitative result, the probability that a class is disease associated, and a quantitative result, the average differential expression of the genes in a class.

The dissertation closes by comparing the approaches available for testing disease associations with functional classes. Most current approaches are enrichment tests, which use dichotomized disease association measures and ask whether classes are overrepresented in a given gene list. In other words, they ask which functional classes contain a higher percentage of differentially expressed genes than the average percentage in the whole genome.
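
The enrichment question described above is commonly posed as a hypergeometric over-representation test. The following is a minimal sketch under that assumption; it is not taken from the dissertation, and the function name `enrichment_p_value` and its parameters are hypothetical names chosen for illustration.

```python
from math import comb


def enrichment_p_value(n_genome, n_selected, n_class, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_genome   -- total number of genes considered
    n_selected -- number of genes called differentially expressed
    n_class    -- number of genes in the functional class
    n_overlap  -- number of class genes among the selected genes
    """
    max_overlap = min(n_selected, n_class)
    total = comb(n_genome, n_selected)
    # Sum the hypergeometric probability mass from the observed overlap upward.
    tail = sum(
        comb(n_class, k) * comb(n_genome - n_class, n_selected - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Illustrative numbers only: 1000 genes, 50 selected, a class of 20 genes,
# 5 of which appear in the selected list.
p = enrichment_p_value(1000, 50, 20, 5)
```

A small value of `p` suggests that the class holds more selected genes than expected by chance. Because the input is a dichotomized gene list, a class whose genes all shift moderately but rarely cross the selection cutoff contributes little to `n_overlap`.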
A more biologically relevant test would compare the distribution of expression values of the genes in a functional class in disease samples with the same distribution in normal samples. Disease-associated functional classes would show differences between the two distributions. We propose a disease association test to perform this second type of test. Preliminary simulation studies show that the enrichment test can miss a disease-associated functional class when the class shows moderate differential expression that is not strong enough to change the percentage of top differentially expressed genes it contains. Conversely, when only a small percentage of genes in a disease-associated class are significantly up- or down-regulated, the enrichment test detects the class well, whereas the disease association test appears to be insensitive. It is therefore important to understand which association scenario the data belong to before deciding which test to use to find associations between disease status and functional classes. Choosing the wrong test for the data at hand could result in missing important associations between functional classes and disease. (Abstract shortened by UMI.)
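
To make the second type of test concrete, here is a minimal sketch of a sample-label permutation test on the mean expression of a class's genes. It is a simple stand-in chosen for illustration, not the disease association test proposed in the dissertation; the function names, the choice of the mean as test statistic, and the toy data are all assumptions.

```python
import random
from statistics import mean


def class_shift_statistic(expr, labels):
    """Mean expression of the class genes in disease minus normal samples.

    expr   -- one list per sample, holding the expression of each class gene
    labels -- 1 for a disease sample, 0 for a normal sample
    """
    disease = [v for row, lab in zip(expr, labels) if lab == 1 for v in row]
    normal = [v for row, lab in zip(expr, labels) if lab == 0 for v in row]
    return mean(disease) - mean(normal)


def permutation_p_value(expr, labels, n_perm=2000, seed=0):
    """Two-sided p-value obtained by shuffling the sample labels."""
    rng = random.Random(seed)
    observed = abs(class_shift_statistic(expr, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(class_shift_statistic(expr, shuffled)) >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): every gene in the class is shifted in disease.
toy_expr = [[1.0, 1.2, 0.9]] * 6 + [[0.0, 0.1, -0.1]] * 6
toy_labels = [1] * 6 + [0] * 6
p_shift = permutation_p_value(toy_expr, toy_labels)
```

Because the statistic averages over all genes in the class, it responds to a consistent shift across the class, but a few strongly regulated genes are diluted by the unchanged majority. This matches the trade-off between the two kinds of test described above.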
Keywords/Search Tags: Model, Functional classes, Disease, Differentially expressed genes, Data, Association, Test