
Cross-modal Retrieval For Free-hand Sketch

Posted on: 2021-04-22  Degree: Master  Type: Thesis
Country: China  Candidate: J Y Xue  Full Text: PDF
GTID: 2428330632962937  Subject: Electronic and communication engineering
Abstract/Summary:
In recent years, cross-modal retrieval for free-hand sketches has become a popular research area. This thesis focuses on fine-grained cross-modal retrieval between free-hand sketches and natural images, i.e., the fine-grained sketch-based image retrieval (FG-SBIR) task. A query sketch is abstract and ambiguous, while the retrieval targets are ordinary natural images, so a domain gap exists between the two modalities. The key problem of FG-SBIR is therefore to build a bridge between the two modalities that eliminates this gap: visual features must be extracted from both sketches and natural images and embedded into a common embedding space. Accordingly, the FG-SBIR task poses two main challenges: (1) the abstraction of sketches makes it difficult to extract effective visual features from sketches and natural images; (2) a common embedding space suitable for cross-modal information must be constructed.

This thesis is dedicated to solving these two challenges. By analyzing the characteristics of sketches and the difficulties of the cross-modal retrieval task, corresponding solutions are proposed in a novel FG-SBIR model: (1) analysis of sketch data shows that sketches are abstract and sparse; targeting these two characteristics, an attention mechanism is introduced so that the model extracts more effective visual features. (2) Further study shows that existing models focus only on features extracted from the final fully-connected (FC) layer and ignore features from intermediate layers, which are rich in low-level visual information; this thesis therefore fuses the intermediate-layer features with the final FC-layer features to build the common embedding space. (3) To better exploit the intermediate-layer features, a multiple triplet ranking model is proposed, which introduces an auxiliary supervised loss on the intermediate layer to obtain more effective features. Finally, a novel distance metric is proposed to further improve the model's performance.

Extensive experiments are performed on three public fine-grained sketch-image retrieval datasets: QMUL-Shoe, QMUL-Chair, and QMUL-Handbag. The experimental results show that the proposed method outperforms state-of-the-art methods, and comparative experiments verify the effectiveness of each module in the model.
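As a rough illustration of the multiple triplet ranking idea summarized above, the sketch below combines a final-layer triplet loss with an auxiliary triplet loss on intermediate-layer features. All function names, the concatenation-based fusion, and the auxiliary weight are illustrative assumptions; the abstract does not specify the thesis's actual implementation details.

```python
import math

def l2_distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fuse_features(mid_feat, fc_feat):
    """Fuse intermediate-layer and final FC-layer features.
    Concatenation is an assumed fusion operator; the abstract
    only states that the two feature levels are fused."""
    return list(mid_feat) + list(fc_feat)

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard triplet ranking loss: pull the matching photo
    (positive) closer to the sketch (anchor) than the
    non-matching photo (negative), by at least `margin`."""
    d_pos = l2_distance(anchor, positive)
    d_neg = l2_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

def multiple_triplet_loss(final_triplet, mid_triplet,
                          aux_weight=0.5, margin=0.3):
    """Final-layer triplet loss plus an auxiliary supervised
    triplet loss on intermediate-layer features; `aux_weight`
    is an illustrative hyperparameter."""
    main = triplet_loss(*final_triplet, margin=margin)
    aux = triplet_loss(*mid_triplet, margin=margin)
    return main + aux_weight * aux
```

In practice the three feature vectors in each triplet would come from a shared CNN branch applied to the sketch, its matching photo, and a non-matching photo; here plain lists stand in for those embeddings.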
Keywords/Search Tags:Cross-modal, Fine-grained retrieval, Free-hand sketch, Embedding space, Visual feature