Reconstruction Regularized Deep Metric Learning For Multi-label Image Classification

Posted on: 2021-04-01  Degree: Master  Type: Thesis
Country: China  Candidate: C Liu  Full Text: PDF
GTID: 2428330623467764  Subject: Computer Science and Technology
Abstract/Summary:
As a fundamental task in computer vision, multi-label image classification predicts the categories of multiple objects that appear in a single image. A straightforward method is to treat each object in the image independently, casting the task into several binary subproblems, each of which predicts whether an image contains objects of the corresponding label. However, such a simplified method neglects the correlations among the semantic information of different labels, as well as among the visual features of different objects in an image; many existing multi-label learning methods share this limitation. This thesis studies the application of distance metric learning to multi-label image classification, focusing on how metric learning can capture the correlations between image features and labels.

Recently, deep learning techniques have achieved very promising results in various image classification applications. Unlike existing multi-label learning methods, we propose a novel framework called Reconstruction regularized two-way Deep Distance Metric (RETDM) to exploit the correlation information between labels and image features. Specifically, we first learn an embedding subspace into which the original images and labels are embedded via a Convolutional Neural Network (CNN) and a Deep Neural Network (DNN), respectively. Through these two networks, we learn the image features and label features simultaneously and discover the dependencies between them. Moreover, a two-way distance metric learning approach is presented to capture the correlations between the learned image features and label features. Finally, a reconstruction network is incorporated into the framework as a regularization term to make the learned features more representative.

Compared with state-of-the-art methods for the multi-label image classification task, the proposed framework has the following advantages:
1. An end-to-end trainable framework is proposed to integrate comprehensive distance metric learning into deep learning for multi-label image classification.
2. We present a two-way distance metric learning approach based on two different views to capture the correlations between images and labels, which is tailored for multi-label image classification.
3. A reconstruction error-based loss function is introduced to regularize the label embedding space and further improve model performance.

We evaluate the proposed framework with extensive experiments on publicly available multi-label image datasets (e.g., Scene, Mirflickr and MS-COCO). The experimental results demonstrate that our framework achieves significantly better performance than the existing state-of-the-art baselines.
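
The abstract does not give the exact form of the objective, so the following is only a minimal sketch of how a two-way metric loss combined with a reconstruction regularizer might be computed. It assumes a hinge (margin-based) formulation over squared Euclidean distances and a mean squared reconstruction error; the function names, the `margin` and `lam` parameters, and the choice of negatives are all hypothetical and are not taken from the thesis. The embeddings are passed in as plain lists, standing in for the outputs of the CNN and DNN.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def two_way_metric_loss(img_emb, lab_emb, margin=1.0):
    """Hypothetical two-way hinge loss over a batch (an assumption).

    img_emb[i] is the embedding of image i and lab_emb[i] the embedding
    of its label vector. Every other sample in the batch is treated as a
    negative, which ignores the case of two images sharing a label set.
    """
    n = len(img_emb)
    loss = 0.0
    for i in range(n):
        pos = sq_dist(img_emb[i], lab_emb[i])
        for j in range(n):
            if j == i:
                continue
            # Image view: image i should be closer to its own label
            # embedding than to the label embedding of sample j.
            loss += max(0.0, margin + pos - sq_dist(img_emb[i], lab_emb[j]))
            # Label view: label embedding i should be closer to its own
            # image than to the image of sample j.
            loss += max(0.0, margin + pos - sq_dist(lab_emb[i], img_emb[j]))
    return loss / n


def reconstruction_loss(labels, reconstructed):
    """Mean squared error between label vectors and their reconstructions."""
    return sum(sq_dist(y, r) for y, r in zip(labels, reconstructed)) / len(labels)


def total_loss(img_emb, lab_emb, labels, reconstructed, lam=0.1, margin=1.0):
    """Metric loss plus a weighted reconstruction term (lam is assumed)."""
    metric = two_way_metric_loss(img_emb, lab_emb, margin)
    return metric + lam * reconstruction_loss(labels, reconstructed)
```

In this sketch the metric term is zero when each image embedding sits on its own label embedding and the other label embeddings lie at least `margin` further away; the reconstruction term only penalizes label vectors that cannot be recovered from their embeddings.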
Keywords/Search Tags: Multi-label image classification, deep metric learning, reconstruction regularization