Dimensionality of data matrices with applications to gene expression profiles

Posted on: 2010-01-31
Degree: Ph.D
Type: Dissertation
University: University of Illinois at Urbana-Champaign
Candidate: Feng, Xingdong
Full Text: PDF
GTID: 1448390002488857
Subject: Statistics
Abstract/Summary:
Probe-level microarray data are usually stored in matrices. For a given probe set (gene), each row of the matrix corresponds to an array, and each column corresponds to a probe. Often, each array is summarized by a single number, the gene expression level.

Is one number sufficient to summarize a whole probe set for a specific gene in an array? To answer this question, we propose a multiplicative model with random effects to explain data matrices, and develop tests on the dimensionality of such data matrices. We analyze the asymptotic properties of the test statistics and carry out simulation studies to assess the finite-sample performance of the tests. Since the exact distributions of the test statistics are often difficult to obtain, our procedures depend on asymptotic distributions. These asymptotic distributions may approximate the exact distributions poorly when the sample size is small. To improve performance in such cases, we propose using bootstrap techniques. Furthermore, we discuss the additional assumptions needed to construct confidence intervals for the eigenvalues or their ratios.

We apply the proposed tests to real probe-level microarray data, and examine some genes that the tests flag as having significant structure beyond the uni-dimensional summary of the data matrices. The study leads to some interesting findings for the microarray data.

Finally, a robust extension of the dimensionality tests is discussed, and a real example is used to demonstrate the merit of the robust alternative.
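
To make the question "is one number per array enough?" concrete, the following minimal Python sketch checks whether an arrays-by-probes matrix looks like a rank-one (multiplicative) structure plus noise. It is not the dissertation's test statistic or its asymptotic procedure, neither of which is given in this abstract. It is a generic residual-bootstrap illustration, and the function names, the statistic (share of leftover energy carried by the second singular value), and the simulated data are all assumptions made for the example.

```python
import numpy as np


def rank_one_fit(X):
    """Best rank-one approximation of X (leading SVD component)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return s[0] * np.outer(U[:, 0], Vt[0])


def second_component_share(X):
    """Share of the energy left after the first component that is
    carried by the second singular value (illustrative statistic)."""
    s = np.linalg.svd(X, compute_uv=False)
    return float(s[1] ** 2 / np.sum(s[1:] ** 2))


def bootstrap_pvalue(X, n_boot=200, seed=0):
    """Residual bootstrap under a rank-one null (hypothetical helper).

    Residuals from the rank-one fit are resampled with replacement,
    added back to the fit, and the statistic is recomputed each time.
    """
    rng = np.random.default_rng(seed)
    fit = rank_one_fit(X)
    resid = (X - fit).ravel()
    observed = second_component_share(X)
    exceed = 0
    for _ in range(n_boot):
        noise = rng.choice(resid, size=X.shape, replace=True)
        if second_component_share(fit + noise) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_boot + 1)


# Simulated example (assumed sizes): 20 arrays, 11 probes, two components.
rng = np.random.default_rng(1)
arrays, probes = 20, 11
a, b = 3 * rng.normal(size=arrays), rng.normal(size=probes)
c, d = rng.normal(size=arrays), rng.normal(size=probes)
X_two = np.outer(a, b) + np.outer(c, d) + 0.1 * rng.normal(size=(arrays, probes))
print(bootstrap_pvalue(X_two))
```

A small p-value here suggests that a single multiplicative component does not capture the matrix, which is the situation the abstract describes as structure beyond the uni-dimensional summary. The bootstrap is used in the same spirit as in the abstract: it avoids relying on an asymptotic approximation when the number of arrays is small.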
Keywords/Search Tags:Data, Matrices, Gene, Dimensionality, Tests