With the rapid development of the Internet and the growth of computing and storage capabilities, smartphones have become ubiquitous, and users share images online every day. When a user wants to find an image of a complex scene but cannot describe it in words, retrieval based on a hand-drawn sketch of that scene becomes highly valuable. This thesis therefore focuses on hand-drawn sketches of complex scenes: their visual retrieval, their semantic understanding, short-text matching of the descriptive sentences generated from them, and cross-domain sketch-image retrieval. The main research work and innovative results of this thesis can be summarized in three aspects.

First, this thesis defines a novel problem: understanding sketches of complex scenes from a semantic point of view. To address it, we construct Sketch Caption, a dataset for the semantic understanding of complex-scene sketches, and compare it with datasets of edge maps extracted by several methods, demonstrating the advantage of hand-drawn data. Building on a CNN-RNN image captioning model, we study how different CNN architectures affect the quality of the text generated from sketches. By introducing an attention mechanism, adapting it to the characteristics of sketches, and limiting the length of the input sentences, we achieve better caption generation results.

Second, this thesis studies short-text matching algorithms, which fall into traditional and deep learning approaches. Our experiments show that the Skip-Thought algorithm is better suited to short-text matching between sketches and images. An improvement based on the Word2vec method further shows that different parts of speech carry different importance in the text matching problem for sketch and image captions.

Finally, we study cross-modal subspace retrieval of sketches and images and carry out dimensionality reduction experiments to remove redundancy on top of several cross-modal subspace retrieval methods. Combining semantic retrieval information, we propose a joint retrieval algorithm over the cross-modal subspace and the semantic space. Extensive experiments with this joint method on the complex-scene sketch dataset Sketch Caption show that the proposed joint retrieval over the cross-modal subspace and the semantic space effectively improves retrieval accuracy. Based on the proposed algorithm, a sketch retrieval system is designed and implemented.
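
To illustrate the idea behind the joint retrieval described above, the following is a minimal sketch of fusing a cross-modal subspace similarity with a semantic (caption-based) similarity when ranking candidate images for a sketch query. The cosine similarity, the weighted-sum fusion, the weight alpha, and all variable names are illustrative assumptions, not the exact formulation used in the thesis.

```python
import numpy as np


def joint_retrieval_ranking(sketch_sub, image_subs, sketch_sem, image_sems, alpha=0.5):
    """Rank candidate images for one sketch query by fusing two scores.

    sketch_sub : (d,)   sketch embedding in the learned cross-modal subspace
    image_subs : (N, d) image embeddings in the same subspace
    sketch_sem : (k,)   semantic (text/caption) embedding of the sketch
    image_sems : (N, k) semantic embeddings of the candidate images
    alpha      : assumed hyperparameter balancing subspace vs. semantic similarity
    """
    def cosine(q, M):
        # Cosine similarity between one query vector and each row of M.
        q = q / (np.linalg.norm(q) + 1e-12)
        M = M / (np.linalg.norm(M, axis=1, keepdims=True) + 1e-12)
        return M @ q

    sub_sim = cosine(sketch_sub, image_subs)   # similarity in the cross-modal subspace
    sem_sim = cosine(sketch_sem, image_sems)   # similarity in the semantic space
    scores = alpha * sub_sim + (1.0 - alpha) * sem_sim
    return np.argsort(-scores)                 # image indices, best match first
```

In this sketch the two spaces are assumed to have been learned beforehand (for example, a subspace from a cross-modal retrieval method and a semantic space from caption embeddings); the fusion step only combines the resulting similarities.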