
Discriminative Metric Learning For Multi-modality Data

Posted on: 2020-03-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Q Liang
Full Text: PDF
GTID: 1488306131467634
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet and social networks, multi-modality data such as text, images, and video are constantly emerging. Such data are large in scale, varied in type, and high in dimensionality. Traditional hand-constructed metrics cannot accurately measure the differences between samples, and earlier metric learning methods designed for single-modality data cannot be applied directly to multi-modal tasks. This thesis focuses on recognition and retrieval tasks for semi-supervised, high-dimensional multi-modality data, and studies models and methods for multi-modal metric learning. The main research results are as follows:

(1) To avoid the curse of dimensionality in high-dimensional feature spaces, we propose a novel method called Efficient Multi-modal Geometric Mean Metric Learning (EMGMML). The method distinguishes the importance of different modalities by introducing weights, and it uses the LogDet divergence to ensure the correlation between different modalities. It has a closed-form solution on each subspace of the Riemannian manifold, which improves both the performance and the efficiency of multi-modal metric learning.

(2) To solve the small-sample-size problem, we propose a novel method called Weighted Graph Embedding-Based Metric Learning (WGEML). The method constructs intrinsic graphs and penalty graphs to fully exploit both the consistency and the complementarity among multiple subspaces. To better address nonlinear separability, the method is further extended to kernel space, which improves the accuracy of facial-image-based kinship verification.

(3) To cope with limited labeled samples, we establish a Semi-supervised Laplace Regularized Multi-modal Metric Learning (SLRMML) method. The method constructs a semi-supervised Laplacian graph to keep the feature-space distribution consistent with the category-label distribution, and it improves the performance of information retrieval and nearest-neighbor classification.

(4) Because using a large number of samples carries a risk of performance degradation, we develop a Semi-Supervised Online Multi-Kernel Similarity (SSOMKS) learning framework with an active sampling strategy. In this framework, the reliability of classification is evaluated using the margin concept, and the unlabeled samples with higher confidence are selected for training. Better performance is obtained with only a few labeled samples.

In conclusion, this thesis addresses the challenges faced by multi-modal metric learning in the context of big data, and proposes learning models and solutions at three levels: feature dimensionality, category labels, and data scale. These solutions provide technical support for applications of metric learning in data mining, computer vision, and pattern recognition.
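As an illustration of the closed-form solution mentioned in result (1), the following is a minimal sketch of plain single-modality geometric mean metric learning, not of EMGMML itself. It assumes the standard formulation in which the metric A minimizes tr(A S) + tr(A^{-1} D), where S and D are scatter matrices of similar and dissimilar pairs, so that A is the geometric mean of S^{-1} and D. The modality weights and the LogDet coupling described in the abstract are omitted, and all function names, the regularization constant, and the toy data are hypothetical.

```python
import numpy as np

def spd_power(M, p):
    """Raise a symmetric positive-definite matrix to the real power p."""
    M = (M + M.T) / 2.0                      # symmetrize against round-off
    w, V = np.linalg.eigh(M)
    return (V * w**p) @ V.T

def geometric_mean_metric(S, D):
    """Geometric mean of S^{-1} and D: the SPD solution of A S A = D."""
    S_half = spd_power(S, 0.5)
    S_inv_half = spd_power(S, -0.5)
    middle = spd_power(S_half @ D @ S_half, 0.5)
    return S_inv_half @ middle @ S_inv_half

def pair_scatter(X, pairs, reg=1e-3):
    """Sum of outer products of pairwise differences, plus a small ridge."""
    diffs = X[pairs[:, 0]] - X[pairs[:, 1]]
    return diffs.T @ diffs + reg * np.eye(X.shape[1])

# Toy data (hypothetical): 40 samples, 5 features, 2 classes.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = rng.integers(0, 2, size=40)

# All unordered pairs, split by whether the labels agree.
idx = np.array([(i, j) for i in range(40) for j in range(i + 1, 40)])
same = y[idx[:, 0]] == y[idx[:, 1]]

S = pair_scatter(X, idx[same])     # similar-pair scatter
D = pair_scatter(X, idx[~same])    # dissimilar-pair scatter
A = geometric_mean_metric(S, D)    # learned Mahalanobis matrix

def mahalanobis_sq(a, b, A=A):
    """Squared distance under the learned metric A."""
    d = a - b
    return float(d @ A @ d)
```

Because the solution needs only a few eigendecompositions rather than an iterative optimizer, it is cheap to compute; a multi-modal variant would presumably repeat a step of this kind per modality, though the thesis abstract does not give those details.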
Keywords/Search Tags: Multi-modality, metric learning, geometric mean, graph embedding, Laplace regularized, active sampling