
Research On Indoor Scene Recognition And Labeling Based On Deep Learning

Posted on: 2020-02-15
Degree: Master
Type: Thesis
Country: China
Candidate: J J Li
Full Text: PDF
GTID: 2438330578954370
Subject: Communication and Information System
Abstract/Summary:
With the development of mobile handheld devices and network technology, scene images are generated through many channels, and a large number of them are untagged. Scene annotation directly reflects the scene category, which helps users understand an image quickly and helps managers organize images more effectively, so it has significant research value. Scene annotation uses tagged image data to train a model of the mapping between the semantic concept space and the visual feature space, and then applies this model to annotate untagged images. Feature extraction is the key step. The quality of traditional hand-crafted features depends on human experience, which ultimately makes indoor scene labeling less accurate than outdoor scene labeling. To solve this problem, an indoor scene recognition and annotation algorithm based on deep learning is proposed. The algorithm is studied from two aspects: multi-layer feature fusion in a convolutional neural network (CNN), and transfer learning.

To improve the expressive power of the features, an indoor scene classification algorithm based on multi-layer fused CNN features and a Fisher classifier is proposed. A 7-layer network model with input and output layers is constructed, and the network's feature extraction ability is used to extract the features of the convolution layers and pooling layers. A feature fusion strategy is determined, and the classification feature vectors are obtained by fusing the convolution-layer and pooling-layer features. Finally, a Fisher classifier completes the classification. Five scene categories are randomly selected from the MIT_Indoor database to study how the size and number of convolution kernels and the number of iterations affect the scene recognition rate. The results show that when the convolution kernels are 10@5×5 and 15@11×11 and the number of iterations is 35,000, the recognition rate of the algorithm reaches 74%. With the samples and model parameters kept consistent, two comparative experiments are designed: one based on PCA features with a Fisher classifier, and the other based on CNN features with a Softmax classifier. The experimental results show that, on the 5-category indoor scene classification task, the classification accuracy of CNN multi-layer fused features with a Fisher classifier is 7% higher than that of CNN features with a Softmax classifier and 10% higher than that of PCA features with a Fisher classifier.

Based on model transfer theory, an indoor scene classification algorithm using transfer learning from the AlexNet model is proposed. The convolution-layer operations and parameters of the AlexNet model are retained, and the weights of the transferred network are initialized with the weights of that model. The last fully connected layer of the AlexNet model is replaced with a new fully connected layer, a Softmax layer, and an output layer. The output-layer dimension is set to 67 to match the number of scene categories, and a Softmax classifier performs the classification and outputs the results. On the MIT_Indoor database, with the number of iterations set to 15,000 and the learning rate set to 0.015, the final accuracy reaches 61.7%. With the samples and model parameters kept consistent, a comparative experiment is designed against the R-bow, BoP, ISPR, D-parts, and GoogLeNet algorithms. The experimental results show that the classification rate of the indoor scene classification algorithm based on AlexNet transfer learning is 23.8% higher than that of the R-bow algorithm, 15.6% higher than that of the BoP algorithm, 11.6% higher than that of the ISPR algorithm, 10.3% higher than that of the D-parts algorithm, and 2% higher than that of the transferred GoogLeNet model.

Image classification is the precondition of image labeling: it finds the relationship between the visual content of an image and semantic text. The deep features of scene images are extracted, and a Softmax classifier completes the classification. The mapping between classification results and annotation words is then applied to annotate the indoor scenes. On the MIT_Indoor scene database, the scene annotation accuracy reaches 61.7%, which effectively improves the accuracy of indoor scene image annotation.
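The final annotation step described above (Softmax classification followed by a lookup from class index to annotation word) can be illustrated with a minimal Python sketch. This is not the thesis implementation: the three-entry `ANNOTATION_WORDS` table, the function names `softmax` and `annotate`, and the example scores are all hypothetical, chosen only for illustration, whereas the thesis uses 67 scene categories and scores produced by the network.

```python
import math

# Hypothetical label table for illustration only; the thesis maps
# 67 MIT_Indoor categories to annotation words.
ANNOTATION_WORDS = {0: "bedroom", 1: "kitchen", 2: "bookstore"}


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def annotate(scores, words=ANNOTATION_WORDS):
    """Return the annotation word of the most probable class and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return words[best], probs[best]


# Example scores are made up; in practice they would come from the
# network's last fully connected layer.
word, confidence = annotate([0.2, 2.5, -1.0])
print(word, round(confidence, 3))
```

Because the annotation word is determined entirely by the predicted class, the annotation accuracy in this scheme equals the classification accuracy, which is consistent with both being reported as 61.7%.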
Keywords/Search Tags: image annotation, deep learning, indoor scene image, convolutional neural network, transfer learning