Statistical approaches for the analysis of matched mRNA microarray data from degraded tissues with application to unfrozen archived newborn blood spots from a case-control study of children with cerebral palsy

Posted on: 2013-04-08
Degree: Ph.D
Type: Dissertation
University: Michigan State University
Candidate: Ho, Nhan Thi
Full Text: PDF
GTID: 1454390008478896
Subject: Epidemiology
Abstract/Summary:
Cerebral palsy (CP) describes a group of defects caused by damage to the motor-controlling centers of the brain. This damage occurs during pregnancy, during childbirth, or in early infancy. The etiology of CP is currently unclear but has been speculated to involve hypoxia, infection, and other influences. In this matched case-control study of children aged 2 to 16 years, we examined mRNA expression patterns in blood for evidence of exposure to agents that have been associated with the development of CP. Prospective collection of newborn blood samples from CP cases and matched controls is not practical, whereas archived unfrozen dried neonatal blood spots (uDNBS) have been shown to preserve enough mRNA for microarray analysis. Therefore, we used previously collected uDNBS for genome-wide expression profiling.

mRNA expression data were derived from a set of 106 uDNBS, representing 53 subjects who subsequently developed CP and 53 control subjects matched on age, gestational age, and gender. Established methods for processing and analyzing microarray data were used to look for changes in gene expression between cases and controls. The analysis focused on a gene set-based approach prioritizing seven pre-selected gene sets representing four major hypothesized pathophysiologic pathways of CP, i.e., inflammation, thyroid disorders, hypoxia/asphyxia, and coagulation disorders. The empirical inflammatory and hypoxic gene sets were significantly down-regulated, while the empirical thyroidal gene set appeared significantly up-regulated. The analysis of gene sets from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database also revealed some significant inflammation-related gene sets. Gestational age and CP type had interactive effects on the expression pattern of the three significant empirical gene sets.

Several important technical and theoretical concepts were also evaluated in detail.

First, the time-dependent degradation of mRNA, or the increasing difficulty of extracting mRNA from uDNBS over time, is inevitable and may affect the technical quality of microarray data produced from uDNBS. Quality issues therefore need to be taken into account when processing and analyzing microarray data from uDNBS. Further evaluation of data quality over time showed that differential expression at the individual gene and gene set levels was more readily detected in uDNBS less than six years old. The proposed approach for selecting housekeeping genes identified six potential housekeeping genes that can be used in quantitative polymerase chain reaction (qPCR) assays to validate microarray data.

Second, the published literature on gene set analysis for matched case-control study designs is sparse, and existing microarray analysis methods may not function properly in this setting. The performance of existing methods was therefore evaluated, and new approaches were developed to address many methodological aspects of gene set analysis of matched microarray data. Both the published GAGE (generally applicable gene set enrichment for pathway analysis) method and the proposed ZZ-GSA (two-stage z-test for gene set analysis) approach can be used for gene set analysis of matched microarray data, although each has strengths and limitations, especially in terms of power and type I error.
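As a rough illustration of what gene set analysis on matched pairs involves, the sketch below computes a paired statistic per gene from case-minus-control differences and then combines the per-gene statistics into one set-level z-score. It is not the dissertation's ZZ-GSA procedure and not GAGE, whose details are not given in this abstract. The function names, the Stouffer-style combination, and the assumption that genes within a set are independent are all illustrative choices made here.

```python
import math
from statistics import mean, stdev


def paired_gene_stat(case, control):
    """Paired t-type statistic for one gene.

    `case[i]` and `control[i]` are expression values for the i-th
    matched pair (hypothetical input layout).
    """
    diffs = [a - b for a, b in zip(case, control)]
    standard_error = stdev(diffs) / math.sqrt(len(diffs))
    return mean(diffs) / standard_error


def gene_set_z(expr_case, expr_control, gene_set):
    """Combine per-gene paired statistics into a set-level z-score.

    Assumption: genes in the set are treated as independent, which real
    expression data usually violate; this is a simplification.
    """
    stats = [
        paired_gene_stat(expr_case[g], expr_control[g]) for g in gene_set
    ]
    z = sum(stats) / math.sqrt(len(stats))
    # Two-sided p-value under a standard normal reference distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


if __name__ == "__main__":
    # Toy data: four matched pairs, two genes (values are made up).
    cases = {"g1": [5.0, 6.0, 7.0, 9.0], "g2": [4.0, 4.0, 6.0, 7.0]}
    controls = {"g1": [4.0, 4.0, 6.0, 7.0], "g2": [5.0, 6.0, 7.0, 9.0]}
    print(gene_set_z(cases, controls, ["g1"]))
    print(gene_set_z(cases, controls, ["g1", "g2"]))
```

Working on within-pair differences is what makes the analysis respect the matching: variation shared by a case and its matched control cancels before any gene-level or set-level statistic is formed.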
Keywords/Search Tags: Microarray data, Matched, mRNA, Case-control study, Gene set, Blood, Approach