Research On Freehand Sketch Semantic Parsing And Recognition Based On Deep Neural Networks

Posted on: 2021-04-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Y Zhu
GTID: 1488306122479874
Subject: Computer Science and Technology
Abstract/Summary:
As a simple and efficient means of expression, freehand sketching has played an important role throughout the development of human society, facilitating communication and the transmission of information. In modern society, freehand sketching is widely used in design and creative fields such as cartoon animation, architectural design, and costume design. However, after a sketch is created, existing computer-aided design systems still require users to annotate its semantics manually, because computers cannot accurately interpret the semantic information a sketch expresses. How to use computers to parse and recognize freehand sketches automatically, efficiently, and accurately, and thereby improve the productivity of practitioners, is a meaningful and challenging frontier topic.

Semantic parsing and recognition of freehand sketches face two major challenges: fewer features and fewer samples. (1) Compared with natural images, freehand sketches consist only of strokes of different lengths and contain no texture or color information. Moreover, freehand sketches are abstract and diverse, so they are difficult to represent accurately with traditional hand-crafted features. (2) Natural images can be captured by pressing a camera button, whereas a freehand sketch must be drawn stroke by stroke, so sketch data are harder to collect than natural images. As a result, the existing sketch datasets are small.

To address these challenges, this dissertation studies freehand sketch semantic parsing and recognition from the perspectives of data representation, freehand sketch characteristics, and deep neural network structure. The main research contents and contributions fall into the following four aspects.

First, this dissertation proposes a two-branch neural network method for freehand sketch parsing. An existing study has shown that large convolution kernels are suitable for extracting features from freehand sketches, but that study addresses the recognition task. The semantic parsing task, in contrast, predicts a label for each stroke rather than for the whole sketch, and large convolution kernels are not suitable for short strokes. For this reason, the two-branch network in this dissertation uses large and small convolution kernels to handle long and short strokes, respectively. To resolve the ambiguity of stroke position in the input image, this dissertation proposes fusing the stroke with its minimum bounding rectangle to form the input image of the networks. Experimental results and analysis show that the proposed fused data representation and two-branch network structure effectively improve parsing accuracy.

Second, this dissertation proposes a semantic parsing method for freehand sketches based on a neural network and a conditional random field. An existing study has shown that the relationships among strokes can improve the accuracy of semantic parsing. However, that study considers only the spatial relationships between strokes and cannot guarantee that the probabilistic graph is connected. It also relies on hand-crafted features with limited representational ability. In contrast, this dissertation constructs a connected probabilistic graphical model by using both the spatial and the temporal relations of strokes, and uses a convolutional neural network to learn features from the input image. To compensate for the limited information in stroke input images, this dissertation proposes fusing the stroke with the whole sketch to form the input image, which strengthens the position and sketch information. Experimental results and analysis show that the proposed method is superior to other existing methods.

Third, this dissertation proposes a semantic parsing method for freehand sketches based on deep transfer learning. It uses abundant natural image data to pretrain a powerful convolutional neural network model, and then applies fine-tuning techniques to predict stroke labels. To improve fine-tuning during transfer learning, this dissertation proposes adding a group convolutional layer to the convolutional neural network to enhance its representational ability. Compared with other methods, the proposed method improves results by 9.7% on the abstract sketch dataset and by 2% on the sketch dataset that corresponds to 3D meshes.

Finally, a low-resolution sketch image recognition method based on joint pixel (image) and point set convolution is proposed. Sketch recognition using deep neural networks has become a new research trend. However, traditional pixel-based (image) convolutional neural networks perform poorly on low-resolution sketch images because image details are lost. To solve this problem, a neural network based on joint pixel and point set convolution is proposed for low-resolution sketch image recognition. The network is equipped with both image convolution and point set convolution, so it can process the image representation and the point set representation of a sketch simultaneously. In addition, this dissertation proposes a hybrid classifier, a corresponding loss function, and a training strategy to better extract features for recognition. Experimental results show that the proposed method is superior to other deep neural networks.
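
The second contribution rests on the observation that a stroke graph built from spatial relations alone may be disconnected, while adding temporal relations yields a connected graph. The following is a minimal sketch of that idea, not the dissertation's implementation. The names `build_stroke_graph` and `is_connected`, the `spatial_threshold` parameter, and the rule that two strokes are spatially related when their closest points lie within that threshold are all assumptions made for illustration. The abstract does not say how spatial relations are defined.

```python
from itertools import combinations
from math import dist


def build_stroke_graph(strokes, spatial_threshold=10.0, use_temporal=True):
    """Build undirected edges between strokes.

    strokes: list of strokes in drawing order, each a list of (x, y) points.
    Returns a set of edges (i, j) with i < j, indexing into `strokes`.
    """
    edges = set()
    if use_temporal:
        # Temporal relation: each stroke is linked to the one drawn next.
        # This chain alone already connects every stroke.
        for i in range(len(strokes) - 1):
            edges.add((i, i + 1))
    # Spatial relation (assumed rule): link two strokes when their closest
    # pair of points is no farther apart than spatial_threshold.
    for i, j in combinations(range(len(strokes)), 2):
        gap = min(dist(p, q) for p in strokes[i] for q in strokes[j])
        if gap <= spatial_threshold:
            edges.add((i, j))
    return edges


def is_connected(num_nodes, edges):
    """Return True if the undirected graph on num_nodes nodes is connected."""
    if num_nodes == 0:
        return True
    neighbors = {i: set() for i in range(num_nodes)}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    seen = {0}
    stack = [0]
    while stack:  # depth-first traversal from node 0
        node = stack.pop()
        for nxt in neighbors[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == num_nodes


# Three strokes in drawing order; the second is far away from the other two.
strokes = [
    [(0, 0), (1, 0)],
    [(100, 100), (101, 100)],
    [(0, 1), (1, 1)],
]
spatial_only = build_stroke_graph(strokes, 5.0, use_temporal=False)
combined = build_stroke_graph(strokes, 5.0)
print(is_connected(len(strokes), spatial_only))
print(is_connected(len(strokes), combined))
```

In this toy example the isolated second stroke has no spatial neighbor, so the spatial-only graph is disconnected. The temporal chain links it to the strokes drawn before and after it. A conditional random field defined over the combined graph can then pass label information to every stroke.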
Keywords/Search Tags: Sketch Semantic Parsing, Sketch Recognition, Deep Neural Networks