
A Generative Adversarial Approach For Generalized Zero-Shot Sketch-Based Image Retrieval

Posted on: 2022-06-12 | Degree: Master | Type: Thesis
Country: China | Candidate: J W Zhu | Full Text: PDF
GTID: 2518306728466174 | Subject: Computer technology
Abstract/Summary:
Cross-modal retrieval allows users to query semantically related data in one modality by supplying data of another media type, for example retrieving images from text, or retrieving videos from images. Within this area, sketch-based image retrieval is an emerging research field. Compared with traditional text-based image retrieval, a sketch provides more intuitive and precise visual information, and with the rise of touch-screen devices this technology has ever more real-world applications. Researchers have recently combined zero-shot learning with sketch-based image retrieval to pose a more challenging and practical setting: zero-shot sketch-based image retrieval, in which the categories in the target retrieval domain are unseen during the training phase. Current solutions transfer the cross-modal knowledge learned on the source domain to the target domain with the help of semantic side information. However, these methods perform poorly when the target retrieval domain contains both unseen categories and categories seen during training; we call this the generalized zero-shot sketch-based image retrieval problem. Two factors explain the difficulty. First, some semantic information is lost when a deep network maps low-level visual features to high-level semantic features. Second, because the model observes only the seen classes during training, it tends to overfit them; at test time, when seen and unseen classes are visually similar, the model is biased toward predicting seen classes, producing retrieval errors. Generalized zero-shot sketch-based image retrieval is therefore considerably harder than the standard setting.

To address these problems, we propose a generative adversarial network based on dual learning for generalized zero-shot sketch-based image retrieval. This model
uses one generative model to map sketch and natural-image features into a shared common subspace, and a second generative model to map the generated semantic features back to the original visual space. This design effectively enforces cycle consistency between the two modalities, reduces semantic loss, and improves retrieval performance. Experiments on two public large-scale sketch/natural-image datasets show that the proposed model outperforms existing state-of-the-art methods on both the traditional and the generalized zero-shot sketch-based image retrieval problems.

In follow-up work, we found that in zero-shot learning tasks, catastrophic forgetting of pre-trained knowledge is one of the main causes of the model's poor generalization on the target domain; this problem typically arises while fine-tuning the model. We therefore also explore a visual feature learning model based on a teacher-student optimization process. During fine-tuning, this model retains as much of the previously acquired knowledge as possible while learning more discriminative visual features. Experiments show that this teacher-student visual feature learning model likewise improves generalized zero-shot sketch-based image retrieval, and that it can be combined with the adversarial generative retrieval model to further improve retrieval performance.
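To illustrate the dual-learning idea described above, the following is a minimal numpy sketch. All dimensions are our own assumptions, and simple linear maps stand in for the thesis's actual generator networks; it shows only the cycle-consistency objective, not the adversarial training loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 512-d visual features, 300-d semantic subspace.
VIS_DIM, SEM_DIM = 512, 300

# Linear stand-ins for the two generative models:
# G maps visual features (sketch or image) into the shared semantic subspace,
# F maps semantic features back to the original visual space.
G = rng.normal(scale=0.01, size=(VIS_DIM, SEM_DIM))
F = rng.normal(scale=0.01, size=(SEM_DIM, VIS_DIM))

def cycle_consistency_loss(x_visual: np.ndarray) -> float:
    """Mean squared reconstruction error of the visual -> semantic -> visual round trip."""
    semantic = x_visual @ G         # forward generator: visual -> semantic
    reconstructed = semantic @ F    # backward generator: semantic -> visual
    return float(np.mean((reconstructed - x_visual) ** 2))

# Sketch features and natural-image features share the same cycle objective,
# which ties both modalities to the one common subspace.
sketch_feats = rng.normal(size=(8, VIS_DIM))
image_feats = rng.normal(size=(8, VIS_DIM))
loss = cycle_consistency_loss(sketch_feats) + cycle_consistency_loss(image_feats)
```

Minimizing this loss alongside the adversarial losses is what discourages the semantic loss discussed above: information that the forward generator discards cannot be recovered by the backward generator, so it is penalized.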
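The teacher-student fine-tuning objective can be sketched in the same spirit. Here the frozen pre-trained extractor's features act as the teacher, and the fine-tuned extractor's features as the student; the penalty weight and dimensions are our own assumptions, not values from the thesis.

```python
import numpy as np

rng = np.random.default_rng(1)

FEAT_DIM = 256  # hypothetical feature dimension

# "Teacher": features from a frozen copy of the pre-trained extractor.
# "Student": features from the extractor being fine-tuned, which drift
# away from the teacher's as fine-tuning updates the weights.
teacher_feats = rng.normal(size=(16, FEAT_DIM))
student_feats = teacher_feats + 0.05 * rng.normal(size=(16, FEAT_DIM))

def distillation_penalty(student: np.ndarray, teacher: np.ndarray) -> float:
    """Penalize the student for drifting from the frozen teacher,
    limiting catastrophic forgetting of pre-trained knowledge."""
    return float(np.mean((student - teacher) ** 2))

def total_loss(task_loss: float, student: np.ndarray,
               teacher: np.ndarray, lam: float = 10.0) -> float:
    """Fine-tuning objective: task loss plus a weighted retention term."""
    return task_loss + lam * distillation_penalty(student, teacher)

loss = total_loss(task_loss=0.8, student=student_feats, teacher=teacher_feats)
```

The retention term pulls the fine-tuned features toward the pre-trained ones, so the model keeps its previous knowledge while the task loss makes the features more discriminative.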
Keywords/Search Tags: Cross-modal Retrieval, Sketch-based Image Retrieval, Generalized Zero-shot Learning, Dual Learning, Generative Adversarial Network