Research On Scene Recognition Algorithm Based On Deep Learning

Posted on: 2021-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: T Wang
Full Text: PDF
GTID: 2428330620464036
Subject: Engineering
Abstract/Summary:
With the emergence of large-scale data sets, specialized hardware, and new algorithms, deep learning, and deep convolutional neural networks in particular, has achieved semantic classification capability close to human level in image recognition. In the context of deep learning, scene recognition aims to infer the scene or place shown in a given image: a convolutional neural network is trained on a scene data set to learn the representation patterns of scene images, and then generalizes to new images at the semantic level with high accuracy.

There are three main problems in applying deep learning to scene recognition. First, network accuracy in scene recognition is lower than in other visual tasks such as image classification. Second, the network is constrained by the spatial representation around the center of gravity of the training data and ignores the context information in the scene. Third, accuracy does not increase linearly as network capacity grows. This thesis takes deep-learning-based scene recognition algorithms as its research topic, focusing on the impact of network lightweighting and object semantic features on scene recognition. The main research content is divided into three parts.

First, this thesis analyzes and compares several classical convolutional neural networks in terms of number of parameters, model size, network depth, recognition accuracy, and so on. Standard convolution is replaced with channel separation convolution and the network is readjusted internally, and the validity of the scheme is evaluated on the place20-RGB scene data set. The experimental results show that the lightweight design significantly reduces the number of parameters in the scene recognition network and improves scene recognition accuracy.

Second, this thesis studies the influence of a semantic data set on scene recognition from another angle. A semantic segmentation network classifies color RGB images at the pixel level, and is applied to the place20-RGB scene data set to obtain the corresponding place20-semantic data set. A semantic relation extraction network is then trained on this data set for scene recognition. The experimental results show that a network trained on the semantic data set alone is significantly less accurate than a network trained on color RGB images, but the semantic data set provides complementary object semantic information that can serve as an additional feature for scene recognition.

Finally, this thesis studies a multi-modal deep learning architecture that uses a two-branch network, combining an RGB branch and a semantic branch, to extract the image information and object context information of the scene. The attention mechanism formed during training is used to strengthen the learning of relevant context information. Based on this architecture, this thesis proposes a method to enlarge the range of attention. By improving the attention module of the semantic branch in the existing architecture, the method guides the attention of the network with the spatial and channel relationships provided, and strengthens the formation of attention from the semantic feature source. Combining this with the lightweight improvement yields a two-branch multi-modal scene recognition network based on semantic attention, which further improves the feature representation for scene recognition. Experimental results show that the network achieves better recognition accuracy on the place20-RGB scene data set.
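The parameter reduction claimed for the lightweight network can be illustrated with a small calculation. The sketch below assumes that "channel separation convolution" refers to a depthwise separable convolution (a per-channel k x k filter followed by a 1 x 1 pointwise convolution); the abstract does not confirm this, and the layer sizes and function names are hypothetical, chosen only for illustration.

```python
# Illustrative only: assumes "channel separation convolution" means a
# depthwise separable convolution. Layer sizes below are hypothetical.

def standard_conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weight count of a depthwise k x k convolution (one filter per input
    channel) followed by a 1 x 1 pointwise convolution (bias omitted)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 128 input channels, 256 output channels, 3 x 3 kernel.
std = standard_conv_params(128, 256, 3)
sep = separable_conv_params(128, 256, 3)
ratio = sep / std  # algebraically equal to 1/c_out + 1/k**2

print(f"standard: {std}, separable: {sep}, ratio: {ratio:.3f}")
```

Under these assumptions the ratio is 1/c_out + 1/k², so a 3 x 3 layer keeps roughly one ninth of its weights. Whether the thesis network sees a comparable reduction depends on its actual layer configuration, which the abstract does not state.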
Keywords/Search Tags: deep learning, convolutional neural networks, scene recognition, attention mechanisms