Data mining approaches for interpreting protein subcellular location patterns in fluorescence microscope images

Posted on: 2005-09-06
Degree: Ph.D.
Type: Dissertation
University: Carnegie Mellon University
Candidate: Huang, Kai
Full Text: PDF
GTID: 1450390008494150
Subject: Biology
Abstract/Summary:
Systematic exploration of the proteome, the entire set of proteins expressed in a cell, has become one of the main focuses of life science in the post-genome era. The properties of a protein include its amino acid sequence, its expression levels at various developmental stages and in different tissues, its 3D structure and active sites, its functional and structural binding partners, and its subcellular location. Knowledge of a protein's subcellular distribution can contribute to a complete understanding of its function in a number of different ways. Among all approaches for determining protein subcellular location, fluorescence microscopy permits rapid collection of images with excellent resolution between cell compartments. These properties, along with highly specific methods for targeting fluorescent probes to particular proteins, make fluorescence microscopy the optimal choice for studying the subcellular distribution of a proteome.

The Murphy Lab has built an automated system that uses numerical features to classify and cluster the protein subcellular location patterns depicted in fluorescence microscope images.
This work extends automated classification of protein subcellular distribution patterns in four directions: comparing eight feature reduction methods, namely PCA, nonlinear PCA, ICA, KPCA, classification trees, fractal dimensionality reduction, genetic algorithms, and stepwise discriminant analysis; improving the classification of major subcellular patterns depicted in both 2D and 3D fluorescence microscope images by developing new image features and constructing ensemble classifiers from individual classifiers such as support vector machines, AdaBoost, Bagging, neural networks, and Mixtures-of-Experts; extending the current approaches, which work on single-cell images, to multi-cell images by selecting image features that are invariant to the number of cells and by developing cell segmentation algorithms guided by moderate single-cell classifiers; and building a complete database schema for structuring 2D to 5D fluorescence microscope images, together with a fully automated image retrieval system that supports both annotation-based and content-based queries and is built on the Oracle platform with J2EE and a web interface.

This work illustrates the application of advanced computer vision and pattern recognition techniques to digital images generated by quantitative microscopy. An objective, accurate, and high-throughput system is necessary for reliable and robust image interpretation in biomedical optics applications. The tools and methods resulting from this work, along with high-throughput imaging hardware, can be used to determine the subcellular location of every protein expressed in a given cell type.
Keywords/Search Tags: Protein, Subcellular location, Fluorescence microscope images, Patterns, Approaches