Free-hand Sketch Based Visual Retrieval Study

Posted on: 2020-06-03

Degree: Doctor

Type: Dissertation

Country: China

Candidate: P Xu

GTID: 1368330572472160

Subject: Information and Communication Engineering

Abstract/Summary:

This thesis studies sketch-based visual retrieval, mainly covering fast retrieval for large-scale free-hand sketches, fine-grained sketch-based image retrieval, and fine-grained sketch-based video retrieval. Its contributions can be summarized as follows.

Firstly, the third chapter defines a novel topic, fast retrieval for large-scale free-hand sketches, and studies several intrinsic data traits of large-scale free-hand sketches using million-scale sketches as a test-bed. The author proposes a deep hashing network whose major innovations are: (1) A two-branch CNN-RNN architecture is used for sketch feature learning and representation, with the CNN extracting abstract visual concepts and the RNN modeling the temporal order of human sketching. (2) A novel sketch hashing loss is proposed that suppresses the impact of noisy samples during network training, alleviating the intrinsic abstraction and noise of large-scale sketches. This loss supervises the model to learn a feature space with category-level cohesiveness. Moreover, the proposed two-branch architecture can also be applied to large-scale sketch recognition. Another novel research problem is also defined in the third chapter, namely zero-shot classification for large-scale sketches. The author proposes a deep embedding model to solve this challenging problem, which uses category-level semantic vectors extracted from edge-maps to conduct domain alignment. To obtain high-quality edge-map based semantic vectors, a large-scale edge-map dataset is collected, covering 290,281 edge-maps and 345 categories.

Secondly, the fourth chapter explores cross-modal subspace learning for sketch-based image retrieval and introduces a variety of classical cross-modal subspace learning methods that have been successfully applied to cross-modal matching between images and texts. These methods are then applied to mutual retrieval between sketches and photos, and detailed experimental results and analysis are provided. Based on comparison experiments, the key elements that need to be considered when modeling sketches and photos across modalities are discussed. The experiments also verify the feasibility of applying cross-modal subspace learning to cross-modal matching between sketches and photos.

The fifth chapter defines a challenging problem: fine-grained sketch-based instance-level video retrieval, in which a single sketch or a sequence of multiple sketches is used as a query to retrieve the corresponding video instance. In this scenario, a sketch contains both fine-grained visual appearance information and fine-grained motion information, with fine-grained motion trajectories denoted by arrowed straight lines, curves, and circles. To investigate this problem, the author collects the first fine-grained sketch-based video retrieval dataset, containing 1448 sketches and 528 video clips with rich manual annotations. A multi-stream multi-modality neural network is proposed, which uses the idea of meta-learning to address the scarcity of training samples and achieves good experimental results. The proposed network can be trained not only under strong supervision, but also under weak supervision based on a multi-instance learning framework.
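
As a rough illustration of the hashing retrieval studied in the third chapter, the sketch below shows how binary hash codes are commonly used for fast lookup in general: real-valued features are binarized by sign and gallery items are ranked by Hamming distance to the query code. This is not the thesis's network or loss. The function names (binarize, hamming, retrieve) and the 4-dimensional feature values are hypothetical and chosen only for the example.

```python
from typing import List, Sequence


def binarize(features: Sequence[float]) -> int:
    """Pack the signs of a real-valued feature vector into an integer hash code."""
    code = 0
    for value in features:
        # One bit per dimension: 1 if the feature is positive, else 0.
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query: Sequence[float],
             gallery: List[Sequence[float]],
             top_k: int = 3) -> List[int]:
    """Return gallery indices ranked by Hamming distance to the query code."""
    q = binarize(query)
    codes = [binarize(g) for g in gallery]
    ranked = sorted(range(len(codes)), key=lambda i: hamming(q, codes[i]))
    return ranked[:top_k]


# Made-up 4-dimensional features standing in for learned sketch features.
gallery = [
    [0.9, -0.2, 0.4, -0.7],   # binarizes to 1010
    [-0.5, 0.3, -0.1, 0.8],   # binarizes to 0101
    [0.6, -0.4, -0.3, -0.9],  # binarizes to 1000
]
query = [0.7, -0.1, 0.5, -0.3]  # binarizes to 1010

print(retrieve(query, gallery, top_k=2))  # prints [0, 2]
```

Because comparing codes reduces to an XOR and a bit count, this kind of lookup stays cheap even for very large galleries, which is the usual motivation for hashing in large-scale retrieval.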

Keywords/Search Tags:

Free-Hand Sketch, Retrieval, Hashing Retrieval, Image Retrieval, Video Retrieval, Cross-Modal Retrieval

Related items

1	Cross-model Retrieval For Free-hand Sketch
2	Research On Single-modal And Cross-modal Retrieval By Hashing Technology
3	Research On Single-modal And Cross-modal Retrieval Technology Based On Hash Method
4	Instance-Aware Image Retrieval Technology Based On Multi-Task CNN
5	Research On Sketch-based Image Retrieval Via Deep Supervised Hashing
6	Research On The Rotational Invariant Deep Hashing Algorithm For Cross-Modal Retrieval
7	Cross-Modal Hashing For Efficient Multimedia Retrieval
8	Heterogeneous Graph Hashing For Cross-Modal Audio-Image Retrieval
9	Cross-modal Multimedia Information Retrieval
10	Research On Algorithm Of Deep Convolution Network And Feature Fusion For Cross Modal Commodity Retrieval