
Deep Learning-Based Sketch Generation, Recognition and Application

Posted on: 2021-04-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Y Zhang
Full Text: PDF
GTID: 1368330614472215
Subject: Computer Science and Technology
Abstract/Summary:
Sketches are highly intuitive to humans and have long served as an effective communication tool. With the rapid spread of portable touchscreen devices such as smartphones, tablets, sketchpads, and even watches, sketches have become much easier to obtain and are often only a few finger sweeps away. Unlike conventional images, which contain color and texture, sketches are naturally sparse and contain minimal detail, yet humans can easily determine the object category to which a sketch belongs. This suggests an inherent sparseness in the human neuro-visual representation of objects. Studying such sparse sketches can therefore aid our understanding of the human cognitive processes involved and spur the design of efficient visual classifiers. Furthermore, because sketches are a simple and powerful means of communication, different people can identify the target and content of the information of interest quickly and correctly, which makes it easier to overcome barriers of culture, language, time, and age. Exploring the feasibility of free-hand sketches can thus promote the dissemination of human knowledge and information, as well as a better understanding of the emotions behind that information.

In this thesis, we take the free-hand sketch as the main research object and thoroughly study applications of sketch recognition built on deep learning-based image recognition. The research covers sketch generation and classification, sketch-based image retrieval, action recognition, and other active topics. Sketch recognition is the core of these tasks, and the aim of this thesis is to obtain better sketch representations that support accurate and efficient sketch recognition. Research on sketch-related problems therefore has important theoretical significance and practical value. The contributions of this thesis are as follows:

(1) To address the shortage of sketch-specific training data, the poor accuracy of sketch-based image retrieval, and insufficient generalization ability, we propose a multi-scale sketch generation network. First, we propose a multi-scale convolutional neural network to generate a rough sketch; it uses a multi-scale, multi-level strategy to extract low-level and high-level features and thus exploits information at different levels. Second, we propose a sketch thinning method that measures the degree of match between the rough sketch and a refinement template. Drawing on the hit transform theory in mathematical morphology, we introduce a two-pass algorithm that generates the final sketch using sums of weights. Finally, we use thin plate splines for non-rigid deformation of the sketch, which accounts for the style variation that arises during drawing from differences in the drawers' cultural backgrounds and drawing skill. Experimental results show that the proposed model achieves satisfactory results for sketch generation on a public database, and that it also addresses the cross-domain problem in sketch-based image retrieval.

(2) Most convolutional neural networks treat the sketch representation in the same way as a natural image representation and do not consider the contribution of shape information to constructing a distinctive feature. To address this, we propose a novel two-branch neural network for feature extraction. First, we use a conventional convolutional neural network to describe appearance information. Next, we extract a shape descriptor using a point-set-based neural network, which represents the input sketch as a point set. In addition, we introduce an alignment network that learns an affine transformation, with the aim of achieving invariance to shape rotation and translation across different sketch styles. Finally, we fuse the appearance and shape features, apply L1 normalization, and train an SVM classifier to make the final prediction. Experimental results show that, because sketches lack color and texture, introducing the point-set-based representation allows more implicit shape features to be mined, which further improves the accuracy of sketch recognition and retrieval.

(3) Because sketches are naturally sparse and abstract, previous recognition methods based on deep neural networks combine appearance and shape features. However, they do not consider the effect of local features and omit mutual learning between the different features. We therefore propose an end-to-end dual-recognition network based on mutual learning. Specifically, we first propose a convolutional neural network based on multi-level feature fusion to extract the appearance feature of the sketch; it combines features from multiple low-level convolutional layers and the fully connected layer, and uses global average pooling to retain visually salient features and reduce the feature dimension. Second, we propose a graph convolution-based neural network to extract the shape feature of the sketch; it uses k-nearest neighbors to construct a point-based graph and then extracts local features to strengthen the shape representation. Finally, we optimize the two-branch recognition network with a mutual learning strategy and introduce category consistency and attention consistency as constraints. Experimental results show that our model outperforms state-of-the-art methods and improves the accuracy of sketch recognition and sketch-based image retrieval. Moreover, the model can be applied to other sketch-style recognition tasks through fine-tuning.

(4) Previous work that uses neural networks based on different modalities for action recognition in video has not considered the effect of shape information on performance. We therefore introduce sketch features into the task and propose a mid-level video sketch action network. First, we propose a novel attention-guided sketch generation model, which extracts shape structure from the video to generate the original sketch. We also use the attention-guided mechanism to refine regions of this sketch, removing irrelevant regions and noise, so that we can generate discriminative regions related to action recognition. Next, we propose a novel convolutional neural network based on the original sketch and the point-based sketch; the network takes key frames as input and extracts texture and shape information to predict the video action. Finally, we fuse the action recognition scores from multiple modalities to obtain the final result. Experimental results show that the video sketch modality plays an important role and is highly beneficial to performance.
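
As an illustration of two building blocks named in contributions (2) and (3), the following is a minimal sketch in plain Python, not the thesis implementation. It builds a k-nearest-neighbor graph over the points of a sketch and fuses two feature vectors followed by L1 normalization. The function names (`knn_graph`, `l1_normalize`, `fuse_features`) are hypothetical, Euclidean distance is assumed for the neighbor search, and concatenation is assumed as the fusion operation, since the abstract does not specify either choice.

```python
import math


def knn_graph(points, k):
    """Return, for each point index, the indices of its k nearest neighbours.

    Assumption: neighbours are ranked by Euclidean distance; the abstract
    only states that k nearest neighbours are used to build the graph.
    """
    if not 0 < k < len(points):
        raise ValueError("k must satisfy 0 < k < number of points")
    graph = []
    for i, p in enumerate(points):
        # Distance from point i to every other point (the point itself is excluded).
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()  # equal distances are ordered by the lower point index
        graph.append([j for _, j in dists[:k]])
    return graph


def l1_normalize(vec):
    """Scale a vector so that the sum of its absolute values equals 1."""
    total = sum(abs(v) for v in vec)
    if total == 0:
        return list(vec)  # an all-zero vector is returned unchanged
    return [v / total for v in vec]


def fuse_features(appearance, shape):
    """Fuse two feature vectors, then apply L1 normalization.

    Assumption: fusion is plain concatenation (hypothetical choice).
    """
    return l1_normalize(list(appearance) + list(shape))


if __name__ == "__main__":
    # Four points sampled along a hypothetical stroke.
    stroke_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
    print(knn_graph(stroke_points, k=2))
    print(fuse_features([1.0, 3.0], [-2.0, 2.0]))
```

In a full pipeline the neighbor lists would define the edges consumed by a graph convolution layer, and the fused vector would be passed to a classifier such as an SVM; both steps are outside the scope of this sketch.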
Keywords/Search Tags: Freehand Sketch, Object Category Recognition, Sketch-based Image Retrieval, Action Recognition, Mutual Learning, Deep Learning, Attention Mechanism, Deep Convolutional Neural Network, Machine Vision