Machine Learning for Flow Cytometry Data Analysis

Posted on:2012-03-02

Degree:Ph.D

Type:Thesis

University:University of Michigan

Candidate:Lee, Gyemin

GTID:2458390011950299

Subject:Engineering

Abstract/Summary:

This thesis addresses the problem of automatic flow cytometry data analysis. Flow cytometry is a technique for rapid cell analysis that is widely used in biomedical and clinical laboratories. Quantitative measurements from a flow cytometer provide rich information about the physical and chemical characteristics of a large number of cells. In clinical applications, flow cytometry data are visualized on a sequence of two-dimensional scatter plots and analyzed through a manual process called "gating". This conventional analysis is time-consuming, labor-intensive, and highly subjective. In this thesis, we present novel machine learning methods for flow cytometry data analysis that address these issues.

We begin with a method for generating a high-dimensional flow cytometry dataset from multiple low-dimensional datasets. We present an imputation algorithm based on clustering and show that it improves upon a simple nearest-neighbor approach, which often induces spurious clusters in the imputed data. This technique enables the analysis of multidimensional flow cytometry data beyond the fundamental measurement limits of instruments.

We then present two machine learning methods for automatic gating. Gating is the process of identifying subsets of cell populations that are of interest. Pathologists make clinical decisions by inspecting the results of gating. Unfortunately, this process is performed manually in most clinical settings, which poses many challenges for high-throughput analysis.

The first approach is an unsupervised learning technique based on multivariate mixture models. Because measurements from a flow cytometer are often censored and truncated, standard model-fitting algorithms can produce biased estimates and lead to poor gating results. We propose novel algorithms for fitting multivariate Gaussian mixture models to data that are truncated, censored, or both.

Our second approach is a transfer learning technique combined with the low-density separation principle. Unlike conventional unsupervised learning approaches, this method can leverage existing datasets previously gated by domain experts to automatically gate a new flow cytometry dataset. Moreover, the proposed algorithm can adaptively account for biological variation across multiple datasets.

We demonstrate these techniques on clinical flow cytometry data and evaluate their effectiveness.
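
The abstract's remark that censoring biases standard model fitting can be illustrated with a deliberately simplified sketch. The code below is not the thesis's algorithm: it assumes a single univariate Gaussian (not a multivariate mixture), right-censoring only (values above a hypothetical detector limit are recorded as the limit), and synthetic data. The function name `fit_right_censored_gaussian` and all parameter values are illustrative assumptions. It compares the naive sample mean with an EM estimate that replaces each censored value by its expected sufficient statistics.

```python
import math
import random


def std_normal_pdf(a):
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(a):
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))


def fit_right_censored_gaussian(values, limit, n_iter=200):
    """EM for one univariate Gaussian whose samples are clipped at `limit`.

    Values >= limit are treated as censored: only "the true value exceeds
    the limit" is known for them. (Illustrative helper, not from the thesis.)
    """
    observed = [v for v in values if v < limit]
    n = len(values)
    n_obs = len(observed)
    n_cens = n - n_obs
    sum_obs = sum(observed)
    sumsq_obs = sum(v * v for v in observed)

    # Start from the naive (biased) estimates computed on the clipped data.
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)

    for _ in range(n_iter):
        # E-step: first and second moments of a Gaussian restricted to
        # (limit, +inf) under the current parameters.
        alpha = (limit - mu) / sigma
        tail = max(1.0 - std_normal_cdf(alpha), 1e-300)
        lam = std_normal_pdf(alpha) / tail
        e_z = mu + sigma * lam
        e_z2 = mu * mu + sigma * sigma * (1.0 + alpha * lam) + 2.0 * mu * sigma * lam

        # M-step: Gaussian MLE using the expected sufficient statistics.
        mu_new = (sum_obs + n_cens * e_z) / n
        ss = sumsq_obs - 2.0 * mu_new * sum_obs + n_obs * mu_new ** 2
        ss += n_cens * (e_z2 - 2.0 * mu_new * e_z + mu_new ** 2)
        mu, sigma = mu_new, math.sqrt(ss / n)
    return mu, sigma


# Synthetic data: latent values are standard normal, but the (hypothetical)
# instrument records nothing above LIMIT.
random.seed(0)
LIMIT = 0.5
latent = [random.gauss(0.0, 1.0) for _ in range(20000)]
recorded = [min(z, LIMIT) for z in latent]

naive_mu = sum(recorded) / len(recorded)
em_mu, em_sigma = fit_right_censored_gaussian(recorded, LIMIT)

print("naive mean:", round(naive_mu, 3))
print("EM mean, std:", round(em_mu, 3), round(em_sigma, 3))
```

In this toy setting the naive mean is pulled below the true value of 0 because every censored measurement is recorded at the limit, while the EM estimate stays close to the latent parameters. Extending the idea to truncation and to multivariate mixtures, as the thesis proposes, requires considerably more machinery than this sketch shows.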

Keywords/Search Tags:

Flow cytometry data, Machine learning, Technique

Related items

1	Study On Leukocyte Classification Algorithm Contained In Imaging Flow Cytometry
2	Design Of Portable Flow Cytometry Based On ARM
3	Study Of Key Techniques Of Imaging Flow Cytometry For Bio-Particle Detection And Recognition
4	Research On The Polymer Microfluidic Chip Technology For Flow Cytometry In Spaceflight
5	Research And Implementation Of Identifying P2P Flow With Machine Learning Based On SVM
6	Scalable data clustering using GPUs
7	Partial Differential Equation Modeling of Flow Cytometry Data from CFSE-based Proliferation Assays
8	Research And Implementation Of Traffic Flow Forecasting Model Based On Machine Learning
9	Case Studies For Semantic Aware Statistical Machine Learning Applications In Code Security Problems
10	Research Of Flow-optimized Technique Applied In Workflow