
Metric Learning Revisited: New Approaches for Supervised and Unsupervised Metric Learning with Analysis and Algorithms

Posted on: 2013-09-22
Degree: Ph.D.
Type: Thesis
University: McGill University (Canada)
Candidate: Abou-Moustafa, Karim Tamer
Full Text: PDF
GTID: 2458390008969939
Subject: Applied mechanics
Abstract/Summary:
In machine learning one is usually given a data set of real high-dimensional vectors X, from which a learning algorithm is to select a hypothesis θ from the space of hypotheses Θ. An immediate assumption usually imposed on X is that it is a subset of the very general embedding space ℝ^p, which makes the Euclidean distance ‖·‖₂ the default metric for the elements of X. Since various learning algorithms assume that the input space is ℝ^p with its endowed metric ‖·‖₂ as a (dis)similarity measure, selecting the hypothesis θ becomes intrinsically tied to the Euclidean distance.

Metric learning is the problem of selecting a specific metric d_X from a certain family of metrics D based on the properties of the elements in the set X. Under some performance measure, the metric d_X is expected to perform better on X than any other metric d ∈ D. If the learning algorithm replaces the very general metric ‖·‖₂ with the metric d_X, then selecting the hypothesis θ will be tied to the more specific metric d_X, which carries all the information on the properties of the elements in X.

In this thesis I propose two algorithms for learning the metric d_X: the first for supervised learning settings, and the second for unsupervised as well as supervised and semi-supervised settings. In particular, I propose algorithms that take into consideration the structure and geometry of X on one hand, and the characteristics of real-world data sets on the other. Moreover, if dimensionality reduction is also sought, then under some mild assumptions on the topology of X, and based on the available a priori information, one can learn an embedding of X into a low-dimensional Euclidean space ℝ^p₀ with p₀ ≪ p, in which the Euclidean distance better reveals the similarities between the elements of X and their groupings (clusters). That is, as a by-product, we obtain dimensionality reduction together with metric learning.

In the supervised setting, I propose PARDA, or Pareto discriminant analysis, for discriminative linear dimensionality reduction. PARDA is based on the machinery of multi-objective optimization: simultaneously optimizing multiple, possibly conflicting, objective functions. This allows PARDA to adapt to the class topology in the lower-dimensional space, and to naturally handle the class-masking problem that is inherent in Fisher's discriminant analysis framework for multiclass problems. As a result, PARDA yields significantly better classification results than modern techniques for discriminative dimensionality reduction.
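The abstract does not spell out PARDA's optimization, so the following is only a minimal sketch of the multi-objective idea it describes, not the thesis's algorithm: one Fisher-like between-class objective is kept per class pair, and the objectives are scalarized with weights (a standard way to approximate one Pareto-optimal solution), so that no single well-separated pair dominates and masks the others. The function names (pairwise_scatter, multiobjective_lda), the weighted-sum scalarization, and the ridge regularizer are all illustrative assumptions.

```python
import numpy as np

def pairwise_scatter(X, y):
    """Per-class-pair between-class scatter matrices and pooled within-class scatter."""
    classes = np.unique(y)
    means = {c: X[y == c].mean(axis=0) for c in classes}
    Sw = sum(np.cov(X[y == c], rowvar=False) * (np.sum(y == c) - 1) for c in classes)
    Sb_pairs = [np.outer(means[ci] - means[cj], means[ci] - means[cj])
                for i, ci in enumerate(classes) for cj in classes[i + 1:]]
    return Sb_pairs, Sw + 1e-6 * np.eye(X.shape[1])  # ridge keeps Sw invertible

def multiobjective_lda(X, y, dim, weights=None):
    """Maximize a weighted sum of the per-pair objectives tr(W' Sb_k W) against
    the within-class scatter, via the generalized eigenproblem (sum_k w_k Sb_k) v = lam Sw v."""
    Sb_pairs, Sw = pairwise_scatter(X, y)
    if weights is None:
        weights = np.full(len(Sb_pairs), 1.0 / len(Sb_pairs))  # uniform scalarization
    Sb = sum(w * S for w, S in zip(weights, Sb_pairs))
    evals, evecs = np.linalg.eig(np.linalg.solve(Sw, Sb))
    top = np.argsort(evals.real)[::-1][:dim]
    return evecs[:, top].real  # p x dim projection matrix W; embed with X @ W
```

Re-weighting the pairs (for instance, up-weighting the currently worst-separated pair and re-solving) traces out different trade-offs between the conflicting objectives, which is closer in spirit to what a Pareto-based method explores than any single fixed scalarization.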
In the unsupervised setting, I propose an algorithmic framework, denoted by 𝒳 (note the different notation), that encapsulates spectral manifold learning algorithms and gears them for metric learning. The framework 𝒳 captures the local structure and the local density information around each point in a data set, and hence carries all the information on the varying sample density in the input space. The structure of 𝒳 induces two distance metrics for its elements, the Bhattacharyya-Riemann metric d_BR and the Jeffreys-Riemann metric d_JR. Both metrics reorganize the proximity between the points in X based on the local structure and density around each point. As a result, combining the metric space (𝒳, d_BR) or (𝒳, d_JR) with spectral clustering and Euclidean embedding yields significant improvements in clustering accuracies and error rates on a large variety of clustering and classification tasks.
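For concreteness, here is one way to instantiate the local-Gaussian idea behind d_BR and d_JR. This is not the thesis's Riemannian construction: it only fits a Gaussian to each point's k-NN neighborhood and compares neighborhoods with the classical Bhattacharyya distance and Jeffreys (symmetrized KL) divergence, whose closed forms for Gaussians are standard. The function names and the k-NN and regularization choices are assumptions.

```python
import numpy as np

def local_gaussians(X, k=10):
    """Fit a Gaussian (mean, covariance) to the k-NN neighborhood of each point."""
    n, p = X.shape
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    nbrs = np.argsort(d2, axis=1)[:, :k + 1]             # index 0 is the point itself
    mus = np.array([X[nbrs[i]].mean(axis=0) for i in range(n)])
    # small ridge keeps each local covariance invertible for small k
    covs = np.array([np.cov(X[nbrs[i]], rowvar=False) + 1e-6 * np.eye(p) for i in range(n)])
    return mus, covs

def bhattacharyya(mu1, S1, mu2, S2):
    """Bhattacharyya distance between two Gaussians (closed form)."""
    S = 0.5 * (S1 + S2)
    dm = mu1 - mu2
    return (0.125 * dm @ np.linalg.solve(S, dm)
            + 0.5 * (np.linalg.slogdet(S)[1]
                     - 0.5 * (np.linalg.slogdet(S1)[1] + np.linalg.slogdet(S2)[1])))

def jeffreys(mu1, S1, mu2, S2):
    """Jeffreys divergence (sum of the two KL divergences) between two Gaussians."""
    p = len(mu1)
    dm = mu1 - mu2
    S1inv, S2inv = np.linalg.inv(S1), np.linalg.inv(S2)
    return 0.5 * (np.trace(S2inv @ S1) + np.trace(S1inv @ S2) - 2 * p
                  + dm @ (S1inv + S2inv) @ dm)

# Usage sketch: n x n divergence matrix -> Gaussian affinity for spectral clustering.
# mus, covs = local_gaussians(X, k=10)
# D = np.array([[bhattacharyya(mus[i], covs[i], mus[j], covs[j])
#                for j in range(len(X))] for i in range(len(X))])
# A = np.exp(-D / D.mean())
```

Because each point is compared through its neighborhood's mean and covariance rather than through its raw coordinates, two points end up close only when their local structures and densities agree, which is the reorganization of proximities the abstract describes.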
Keywords/Search Tags: Metric, Algorithms, Supervised, Dimensionality reduction, PARDA