
Research On Metric Learning Methods For Heterogeneous And Multi-output Data

Posted on: 2019-05-31
Degree: Master
Type: Thesis
Country: China
Candidate: R Qi
GTID: 2428330599950156
Subject: Computer technology
Abstract/Summary:
With the rapid development of the information age, the amount of mixed and multi-output data in healthcare, multimedia retrieval, and scientific research has grown rapidly. Clustering, classification, and regression tasks on such data face enormous challenges, and it is crucial to exploit the characteristics of mixed and multi-output data when computing the distance or similarity between samples. This thesis focuses on the requirements of classifying mixed and multi-output data and studies metric learning methods for both kinds of data. Its main work and innovations are as follows.

(1) We propose a geometric mean metric learning method for heterogeneous data. Numerical data and categorical data are mapped to a reproducing kernel Hilbert space using different kernel functions, thus avoiding the negative influence of high feature dimensionality. At the same time, we propose a multiple kernel metric learning model based on the geometric mean (MKGMML), which transforms the metric learning problem for heterogeneous data into finding the midpoint between two points on a Riemannian manifold. To avoid overfitting, the optimization objective is regularized by the symmetrized LogDet divergence. MKGMML is very efficient because each distance metric has a closed-form solution. The algorithm is superior to existing metric learning methods in both precision and efficiency.

(2) We propose a support vector heterogeneous metric learning framework for mixed numerical and categorical data. Almost all existing works focus on defining new distance metrics rather than learning discriminative metrics for mixed data. We define a heterogeneous sample-pair kernel for mixed data, which converts metric learning into a sample-pair classification problem. The proposed model can be solved efficiently by standard support vector machine solvers. To account for the relative importance of numerical and categorical data, a multiple kernel learning model is developed to learn a weighted metric for mixed data. Experiments on benchmark mixed data validate the superior performance of the proposed metric learning model.

(3) We propose a novel relation alignment metric learning (RAML) formulation. Most existing metric learning methods learn a similarity or distance measure from similar and dissimilar relations between sample pairs. However, in many real-world applications, such as multi-label learning, label distribution learning, and tasks with continuous decision values, pairs of samples cannot simply be labeled as similar or dissimilar. The relation between two samples can instead be measured by the degree of difference between their decision values. Motivated by the consistency of sample relations in the feature space and the decision space, RAML uses the sample relations in the decision space to guide metric learning in the feature space. In this way, RAML formulates metric learning as a kernel regression problem, which can be optimized efficiently by standard regression solvers. We carry out experiments on single-label classification, multi-label classification, and label distribution learning tasks to demonstrate that our method achieves favorable performance against state-of-the-art methods.
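The closed-form, midpoint-on-the-manifold idea in contribution (1) can be illustrated with the basic single-metric geometric mean construction that it builds on. The sketch below is not the thesis's MKGMML: it omits the kernel mappings for numerical and categorical features and the symmetrized LogDet regularizer, and assumes purely numerical inputs. The function names (`spd_power`, `geometric_mean`, `gmml_metric`, `metric_distance`), the random pair sampling, and the small `ridge` term that keeps the scatter matrices invertible are all assumptions made for illustration.

```python
import numpy as np


def spd_power(M, p):
    """Raise a symmetric positive definite matrix to a real power p."""
    w, V = np.linalg.eigh(M)
    return (V * w**p) @ V.T


def geometric_mean(P, Q):
    """Midpoint of the geodesic joining SPD matrices P and Q."""
    P_half = spd_power(P, 0.5)
    P_neg_half = spd_power(P, -0.5)
    return P_half @ spd_power(P_neg_half @ Q @ P_neg_half, 0.5) @ P_half


def gmml_metric(X, y, n_pairs=2000, ridge=1e-3, seed=0):
    """Closed-form metric A: the geometric mean of inv(S) and D.

    S and D are scatter matrices of difference vectors from randomly
    sampled same-label and different-label pairs (hypothetical sampling
    scheme). The ridge term is an assumption that keeps both invertible.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    i = rng.integers(0, n, n_pairs)
    j = rng.integers(0, n, n_pairs)
    diff = X[i] - X[j]
    same = y[i] == y[j]
    S = diff[same].T @ diff[same] + ridge * np.eye(d)
    D = diff[~same].T @ diff[~same] + ridge * np.eye(d)
    A = geometric_mean(np.linalg.inv(S), D)
    return A, S, D


def metric_distance(A, x, z):
    """Squared distance (x - z)^T A (x - z) under the learned metric."""
    delta = x - z
    return float(delta @ A @ delta)


# Toy data: two Gaussian classes in three dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (100, 3)), rng.normal(2.0, 1.0, (100, 3))])
y = np.repeat([0, 1], 100)
A, S, D = gmml_metric(X, y)
```

No iterative optimizer is involved: the metric comes from two eigendecompositions and one matrix inverse, which is the source of the efficiency that a closed-form solution provides. The resulting `A` satisfies `A @ S @ A == D` up to rounding, the defining property of the geometric mean of `inv(S)` and `D`.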
Keywords/Search Tags: metric learning, kernel regression, geometric mean, support vector machine, heterogeneous data, multi-output data