Preprocessing and differential expression analysis for Affymetrix GeneChip arrays

Posted on: 2006-12-30
Degree: Ph.D.
Type: Dissertation
University: University of California, Davis
Candidate: Zhou, Lei
Full Text: PDF
GTID: 1458390008953588
Subject: Statistics
Abstract/Summary:
The GLA algorithm for preprocessing Affymetrix GeneChip probe-level data has been developed based on the generalized logarithm transformation. The algorithm handles background correction, normalization, transformation and probe-set summarization. One product of applying the algorithm to probe-level data is the GLA expression index. The GLA expression index is evaluated in terms of precision and accuracy by comparing it to other widely accepted expression indices. It is shown to be a highly reproducible expression index that provides a consistent estimate of the fold change and has high specificity and sensitivity for detecting differential expression when the identification rule is a fold change larger than some cutoff value. Thus the GLA algorithm is a good tool for preprocessing probe-level data.

One of the most important applications of gene expression array analysis is to identify sets of biologically significant genes. This task can be carried out on some version of expression indices by fitting a statistical model to each gene. There is an emerging trend of instead carrying out the differential expression analysis for each probe-set on appropriately background-corrected, normalized and transformed probe-level intensities. We have applied the GLA algorithm for preprocessing in both situations. The expression-level modeling approach and the probe-set modeling approaches, including fixed effect modeling and mixed effect modeling, are explored in great detail.
We have concluded that both an expression-level model with the empirical Bayes correction and a probe-set fixed effect model are good choices for conducting differential expression analysis on Affymetrix GeneChip array data, and that the expression-level model with the empirical Bayes correction is the simpler solution, with a similar level of power and a relatively lower false positive rate than the probe-set fixed effect models.

Finally, since the VSN algorithm proposed by Wolfgang Huber et al. uses essentially the same data transformation function as the GLA algorithm, comparisons have been made between the two algorithms. The GLA algorithm has again been shown to be superior to the VSN algorithm in every important aspect.

This dissertation delivers a strong message to Affymetrix GeneChip users that the GLA algorithm is a highly competitive candidate for preprocessing Affymetrix probe-level data and will contribute to high-quality downstream statistical analysis.
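
The abstract does not state the exact parameterization used by the GLA algorithm, so the following is only a minimal sketch of the transformation family it refers to. It assumes the commonly used form glog(y) = ln(y + sqrt(y^2 + c)), where the constant `c` and the function names are illustrative choices, not taken from the dissertation. The sketch also shows why an arsinh-based transform can be described as essentially the same function: the two differ only by a rescaling of the input and an additive constant.

```python
import math


def glog(y, c):
    """Generalized logarithm ln(y + sqrt(y^2 + c)).

    Assumed form; c > 0 is a hypothetical tuning constant.
    Unlike ln(y), it is defined for zero and negative intensities.
    """
    return math.log(y + math.sqrt(y * y + c))


def glog_via_arsinh(y, c):
    """The same transform written as arsinh(y / sqrt(c)) + ln(sqrt(c))."""
    return math.asinh(y / math.sqrt(c)) + 0.5 * math.log(c)


if __name__ == "__main__":
    c = 100.0  # illustrative value only
    for y in (-50.0, 0.0, 10.0, 1000.0):
        print(y, glog(y, c), glog_via_arsinh(y, c))
```

For intensities much larger than sqrt(c), the transform behaves like ln(2y), so differences on the transformed scale can be read as log fold changes, while near zero it is approximately linear and remains finite.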
Keywords/Search Tags: Affymetrix, GLA algorithm, Preprocessing, Expression, Probe-level data