
Research Of Fusing Object Semantics And Appearance Deep Features For Scene Recognition

Posted on: 2020-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: W L Li
GTID: 2428330590995820
Subject: Electronic and communication engineering
Abstract/Summary:
Scene recognition is a key technology and an important research topic in machine learning and deep learning, as well as an active research area in image recognition. Effectively improving scene recognition performance would advance human-computer interaction, image retrieval, video retrieval, intelligent video surveillance and other fields, and would bring substantial economic benefits. The study of scene recognition is therefore both important and challenging. Scene images generally vary in illumination angle, illumination intensity and shape, and often contain partial occlusion and cluttered backgrounds, so they exhibit large intra-class variation and high inter-class similarity. Over recent decades, many efforts have been made to find effective representations that improve scene recognition performance. Early work mainly used prior knowledge to design handcrafted low-level visual features for representing scene images. These low-level features achieve convincing results on simple scene image databases, but perform poorly on more challenging large-scale scene recognition databases. More recently, deep learning has achieved tremendous success in computer science, and deep learning models generally achieve better results than shallow learning models. Object attributes of a scene image have also been found to improve the recognition rate, and the object semantic features of a scene image are high-level features. Building on this previous work, we study a deep comprehensive representation of the scene image. The work of this paper is as follows:
(1) This paper introduces the research background and significance of scene recognition and, based on an extensive review of the domestic and foreign literature, analyzes in detail the state of scene recognition research from the perspectives of both shallow and deep features. It also introduces some common scene datasets.
(2) This paper investigates traditional shallow feature extraction algorithms and comparatively analyzes global features, local features and object attribute features. Typical representative algorithms for these three kinds of features are introduced in detail. Extensive experiments with the Gist, SIFT and OB algorithms are carried out on the OT and MIT67 databases, and the results are compared.
(3) This paper presents a comprehensive representation for scene recognition that fuses deep features extracted from multiple discriminative views: object semantics, global appearance and contextual appearance. These views provide diverse and complementary features. The object semantics representation of the scene image, denoted Spatial-layout-maintained Object Semantics Features (SOSF), is extracted from the output of a deep-learning-based multi-class detector by using Spatial Fisher Vectors (SFV), which simultaneously encode the category and layout information of objects. A multi-direction Long Short-Term Memory (LSTM)-based model is built to represent the contextual information of the scene image, and the activation of the fully connected layer of a Convolutional Neural Network (CNN) is used to represent the global appearance of the scene image. These three kinds of deep features are then fused to make the final scene recognition decision.
(4) Finally, extensive experiments are conducted to evaluate the proposed comprehensive representation on three benchmark scene image databases. The proposed method achieves scene recognition accuracies of 89.51% on the MIT67 database, 78.93% on the SUN397 database and 57.27% on the Places365 database, which are higher than the accuracies obtained by the most recently reported deep-learning-based scene recognition methods.
In summary, the algorithm provides a high-accuracy scene recognition method that can be classed as a multi-view learning technique and uses different views of deep features to perform scene classification. It has high practical value and good development prospects.
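The abstract does not state how the three kinds of deep features are fused, so the following is only a minimal sketch of one common option, score-level (late) fusion, and not the thesis's actual method. It assumes each view (object semantics, contextual appearance, global appearance) has already produced one raw score per scene class; the function names, the equal default weights and the example scores are all hypothetical.

```python
import math


def softmax(scores):
    """Turn raw per-class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_views(view_scores, weights=None):
    """Weighted average of per-view class probabilities.

    view_scores: dict mapping a view name to its list of raw class scores.
    weights: optional dict mapping a view name to its weight
             (assumption: equal weights when omitted).
    """
    names = list(view_scores)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    n_classes = len(view_scores[names[0]])
    fused = [0.0] * n_classes
    for name in names:
        probs = softmax(view_scores[name])
        for c in range(n_classes):
            fused[c] += weights[name] * probs[c]
    return fused


def predict(view_scores, weights=None):
    """Return the index of the scene class with the highest fused probability."""
    fused = fuse_views(view_scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Illustrative scores for three scene classes (made-up numbers, not thesis data).
example_scores = {
    "object_semantics": [2.0, 0.5, 0.1],
    "contextual_appearance": [0.3, 1.5, 0.2],
    "global_appearance": [1.8, 1.0, 0.4],
}
print(predict(example_scores))
```

In this example the contextual view alone would favour the second class, but the other two views outweigh it after averaging, which illustrates how complementary views can correct one another. Feature-level fusion (concatenating the three feature vectors before a single classifier) would be an equally plausible reading of the abstract.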
Keywords/Search Tags: deep feature, object semantics, context feature, comprehensive representation, scene recognition