Deep Ranking For Zero-Shot Multi-Label Image Classification

Posted on: 2019-11-01
Degree: Master
Type: Thesis
Country: China
Candidate: H H Li
Full Text: PDF
GTID: 2428330623962487
Subject: Information and Communication Engineering
Abstract/Summary:
To identify a given category accurately, a traditional image classification system must first obtain labeled training samples of that category, sample some of them as a training set to construct the classifier, and finally test the classifier on new samples. In practice, however, the number of target categories is huge, and it is expensive to obtain labeled samples for all of them. Moreover, when a large number of pictures must be classified, it is not realistic to collect corresponding labels for each image. In such settings, traditional classification methods may not work. To address these issues, earlier research has presented models that use auxiliary sources, such as text data, to help recognize object categories unseen during training. This approach is known as Zero-Shot Learning (ZSL) and is motivated by the human ability to identify new objects from a description alone. Most of these models, however, are limited to images with a single label, whereas most images belong to more than one object category and therefore correspond to more than one label.

Zero-shot multi-label image classification is a cross-learning task that faces the challenges of both zero-shot learning and multi-label classification. Combining the characteristics of the two learning tasks, this thesis defines the learning problem and implements a model that can classify unseen categories for multi-labeled images. On this basis, a unified algorithmic framework for zero-shot multi-label classification based on deep ranking is proposed. It contains three parts: feature extraction, cross-modal mapping and multi-label classification. In particular, a Deep Embedding Model (DEM) is proposed to implement the multimodal mapping between visual features and semantic features, and knowledge transfer from known to unknown categories is realized by means of an intermediate distributed semantic representation. Two classification algorithms, Transductive Multi-label Prediction (TraMP) and Direct multi-label Ranking SVM (DRankSVM), are proposed to realize the final classification. The TraMP algorithm constructs the manifold structure between the training samples and the test samples, and is therefore a transductive learning method. The DRankSVM algorithm uses the pairwise ranking information between labels to constrain the classification, and is therefore an inductive learning method.

Secondly, a novel classification algorithm based on deep multi-instance learning is proposed. Here the feature of a sample is no longer treated as a single vector but is segmented into several blocks, so that a single-instance multi-label classification problem is converted into a multi-instance multi-label learning problem. In this thesis, a step-by-step implementation of the Instance Differentiation (InsDif) algorithm is proposed to achieve the final classification. Specifically, the classification process is divided into two steps. First, the sample feature is divided into multiple instances, and the instances are then grouped into clusters by a clustering algorithm. Second, the correspondences between the cluster centers and the labels are explored to achieve the final classification.

Finally, a large number of experiments have been carried out on three mainstream multi-label datasets, Natural Scene, IAPRTC-12 and NUS-WIDE, to verify the effectiveness of the two algorithms.
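The cross-modal mapping and ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: a fixed linear map `W` stands in for the Deep Embedding Model, and plain cosine-similarity ranking stands in for TraMP and DRankSVM. All names here (`project`, `cosine`, `rank_labels`, `predict_top_k`) and the toy vectors are hypothetical and chosen only for illustration.

```python
import math

def project(W, x):
    """Map a visual feature x into the semantic space with a linear map W.

    Assumption: W (a list of rows) stands in for a learned embedding model.
    """
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_labels(W, x, label_vectors):
    """Rank label names by similarity between the projected image and
    each label's semantic vector (e.g. a word embedding).

    Unseen labels can be ranked as long as a semantic vector exists for them.
    """
    z = project(W, x)
    scores = {name: cosine(z, vec) for name, vec in label_vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)

def predict_top_k(W, x, label_vectors, k):
    """Multi-label prediction: keep the k highest-ranked labels."""
    return rank_labels(W, x, label_vectors)[:k]

# Toy example: 3-dimensional visual feature, 2-dimensional semantic space.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
x = [0.9, 0.1, 5.0]
label_vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.8, 0.6],
    "sky": [0.0, 1.0],
    "car": [-1.0, 0.0],
}
print(predict_top_k(W, x, label_vectors, 2))
```

Because the labels are compared only through their semantic vectors, a category with no training images can still receive a score, which is the knowledge-transfer step the abstract attributes to the intermediate semantic representation.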
Keywords/Search Tags:Zero-Shot Learning, Multi-Label Classification, Cross-Modal Mapping, Multi-Instance Learning, Learning to Rank