LFDA: A probabilistic graphical model for the study of excitation emission matrices

Posted on:2017-06-17

Degree:Ph.D

Type:Dissertation

University:University of Miami

Candidate:Martinez, Oscar Luis

GTID:1448390005976256

Subject:Computer Science

Abstract/Summary:

Traditional classification techniques assume samples are described by vectors of features. However, in some domains samples are gathered by measuring a variable with respect to two or more other variables: for a given value of x and y, measure z. In such domains, samples are more naturally described by matrices or by higher-dimensional arrays.

We present a novel latent Dirichlet allocation (LDA)-based approach for modeling and analyzing fluorescence spectroscopy excitation-emission matrices (EEMs) and other three-way datasets. We introduce parallels between topic modeling and three-way arrays which allow us to adapt LDA-based methods to latent fluorophore studies. The proposed framework views the EEMs as being generated from an underlying hidden pool of fluorophore compounds, and provides a latent fluorophore-space representation of an EEM. We show that this LDA-based model can increase classification performance, especially when paired with parallel factor analysis (PARAFAC), arguably the most widely used tool for dealing with EEMs. Our experiments show that the proposed LDA-based algorithm is in some cases more robust than PARAFAC to certain types of noise and data disturbances. We also observe that pairing this LDA-based method with PARAFAC leads to an improvement in classification performance and to added robustness at high peak signal-to-noise ratio (PSNR) values.

We also present an extended graphical model that incorporates the effect of outside variables that may affect the fluorescent expression of certain compounds. The extended model offers further insight into the interaction between these variables and the latent fluorophore components while facilitating the model building process.

The performance of machine learning algorithms is known to be impaired if the representation of the individual classes in the training set is imbalanced, i.e., one class outnumbers the other class(es). Such is the case for several experiments in this proposal. Many approaches to deal with this problem have been developed, none of them totally satisfactory. Here we propose membership-based minority oversampling (MeMO) as yet another possible solution, and explore, experimentally, the conditions under which it outperforms earlier attempts.

Finally, we introduce a Dempster-Shafer based fusion model that is intended to adaptively merge the PARAFAC and LDA-based models when their outputs are being used for classification purposes.
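
To illustrate the kind of evidence fusion the last paragraph refers to, the following is a minimal sketch of Dempster's rule of combination for two classifier outputs. The abstract does not say how the dissertation derives belief masses or adapts the fusion, so the class labels, the mass values, and the function name `dempster_combine` are all hypothetical; only the combination rule itself is standard.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence: the product mass goes to the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    norm = 1.0 - conflict
    return {k: v / norm for k, v in combined.items()}


# Hypothetical two-class frame of discernment.
FRAME = frozenset({"A", "B"})

# Hypothetical masses; mass on FRAME expresses each model's uncertainty.
m_parafac = {frozenset({"A"}): 0.6, frozenset({"B"}): 0.1, FRAME: 0.3}
m_lda = {frozenset({"A"}): 0.5, frozenset({"B"}): 0.3, FRAME: 0.2}

fused = dempster_combine(m_parafac, m_lda)
for focal, mass in sorted(fused.items(), key=lambda kv: -kv[1]):
    print(sorted(focal), round(mass, 4))
```

Assigning part of each model's mass to the whole frame lets a less certain model defer to the other one, which is one plausible way such a fusion could behave adaptively.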

Keywords/Search Tags:

Model, Classification, PARAFAC, LDA-based

Related items

1	Blind Signal Processing Problem In Wireless Communication System Based On PARAFAC Model
2	Research On Channel Estimation Of Massive MIMO System Based On PARAFAC Model
3	Blind Sources Separation For Polarization Sensitive Array Based On PARAFAC
4	Array Parameter Estimation Based On PARAFAC Analysis
5	Research On Coherent DOA Estimation Algorithm In Complex Electromagnetic Environment
6	The Research On Multiuser Detection Technology In DS-CDMA Communication System
7	Symbiosis Local Binary Model And Its Application
8	DOA Estimation Based On PARAFAC Technology
9	Research On XML Document Management And Classification-based Retrieval Technology In Web
10	Research On Probability Statistical Model For Image Classification