
Simultaneous inference in the analysis of gene expression data

Posted on: 2006-05-11  Degree: Ph.D  Type: Thesis
University: Bowling Green State University  Candidate: Melnykov, Igor  Full Text: PDF
GTID: 2458390008954157  Subject: Statistics
Abstract/Summary:
Gene microarray technology is a powerful tool for the analysis of genetic information. The data obtained in a microarray experiment are usually used to detect differentially expressed genes. This dissertation presents results that enhance the statistical data analysis techniques applied in microarray studies. It is organized in two parts.

The first part deals with hypothesis testing methods arising in the inference on differentially expressed genes. Making simultaneous inference on all genes printed on the array, with strong control of the familywise error rate at a significance level alpha, necessitates multiplicity adjustments. In this dissertation, a new stepwise testing procedure is proposed, which increases the power of the test by considering sharper bounds for exchangeable test statistics. Another approach that increases the power of multiple testing relies on a different error measure, the false discovery rate. In the present work, the convergence of the positive false discovery rate is investigated. It is shown that the convergence occurs under assumptions milder than those applied in the literature. The proved result allows the estimation of the positive false discovery rate as a posterior probability in the Bayesian setting.

The second part concentrates on asymptotic inference about normalized gene expression data. In particular, we investigate the controversial issue of how the normalization of the data eventually affects the convergence of a statistic to a normal random variable. For the case where the normalized data follow a distribution with a zero median, the study of self-normalized products casts new light on the asymptotic distribution of the data. We explore the relation between self-normalized products and stable distributions. The result is critical for data analysis involving heavy-tailed models such as the Cauchy distribution.

The research results presented in this dissertation lead to more efficient techniques for the analysis of gene expression data and to more accurate procedures for statistical inference.
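
The two error criteria named in the abstract can be illustrated with standard textbook procedures. The sketch below is not the stepwise procedure proposed in the dissertation, which the abstract does not specify. It uses Holm's step-down method as a familiar example of strong familywise error rate control and the Benjamini-Hochberg step-up method as a familiar example of false discovery rate control. The function names and the p-values are hypothetical, chosen only for illustration.

```python
from typing import List


def holm_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Holm step-down procedure (standard method, used here as a stand-in).

    Controls the familywise error rate at level alpha in the strong sense.
    Returns one rejection flag per hypothesis, in the input order.
    """
    m = len(p_values)
    # Indices of the hypotheses, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down: stop at the first non-rejection
    return reject


def benjamini_hochberg_reject(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Benjamini-Hochberg step-up procedure (standard method, for comparison).

    Controls the false discovery rate at level alpha for independent tests.
    Returns one rejection flag per hypothesis, in the input order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        # Find the largest k such that the k-th smallest p-value is <= k * alpha / m.
        if p_values[idx] <= rank * alpha / m:
            largest_k = rank
    reject = [False] * m
    for idx in order[:largest_k]:
        reject[idx] = True
    return reject


# Hypothetical p-values for five genes (illustrative only, not from the thesis).
example_p = [0.3, 0.001, 0.02, 0.012, 0.035]
holm_flags = holm_reject(example_p)
bh_flags = benjamini_hochberg_reject(example_p)
print("Holm rejections:", sum(holm_flags))
print("Benjamini-Hochberg rejections:", sum(bh_flags))
```

On these made-up p-values the false discovery rate criterion rejects more hypotheses than the familywise criterion, which is the gain in power the abstract alludes to.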
Keywords/Search Tags: Data, Gene, Inference, False discovery rate