The Research Of Features For Image Scene Recognition

Posted on: 2016-08-29
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M J Zang
Full Text: PDF
GTID: 1228330467495481
Subject: Communication and Information System
Abstract/Summary:
Image scene recognition aims to automatically classify images according to the similarity of their scenes. A human performs this task by understanding images deeply and forming high-level concepts about them, whereas a computer classifies images according to the similarity of the image data in its digital storage format. One of the most important problems in automatic image scene recognition is therefore the "semantic gap" between the "conceptual similarity" perceived by a human and the "digital storage similarity" computed by a machine. Extracting high-level features that mine deeper information from images can narrow this semantic gap and thus increase classification accuracy.

In this dissertation, we focus on features for image scene recognition and present the following contributions:

1. We propose a novel topic feature, the Efficient Topic Feature (E-TF), for image scene classification. Most existing topic models must infer latent variables, which requires a large amount of computation whenever a new image is represented; they also involve class labels in the modeling process, which couples the features to the labels. To address these problems, we propose a novel topic representation that employs the latent variables and learning method of LDA (latent Dirichlet allocation), and then build an efficient topic feature on this representation. The proposed feature shares topics across classes and does not need class labels during extraction, so it avoids the coupling between features and labels. To represent a new image, our approach extracts its E-TF directly through a linear mapping of codewords instead of inferring latent variables, which reduces the amount of computation. We compared our method with three other topic models under similar experimental conditions, as well as with pooling methods, on the 15-Scenes dataset. The results show that E-TF classifies the scene classes with higher accuracy.

2. We propose a low-dimensionality object attribute feature (LD-OB). The object attribute (OB) feature is a high-level image feature that has demonstrated its advantage in classification. However, its dimensionality is high, which makes classification computationally expensive. Moreover, existing dimensionality reduction methods cannot achieve high classification accuracy and significant dimensionality reduction at the same time. We therefore propose LD-OB, which uses a pooling framework to simplify the patterns of the object attribute feature and reduce its dimensionality, together with two strategies for obtaining more suitable descriptors to improve classification performance. We evaluated our approach on three real-world datasets: the event dataset UIUC-Sports, the natural scene dataset LabelMe, and the mixed dataset 15-Scenes. The classification results show that our approach achieves similar or higher accuracy while significantly reducing dimensionality. The computational complexity analysis shows that the method reduces the time complexity of classification.

3. We propose a middle-level feature based on fast sparse coding (F-SC). Existing dictionary learning algorithms for sparse coding are computationally expensive because they must iteratively solve two convex optimization problems, one in the dictionary updating step and one in the codeword assignment step. We propose a sparse encoding that uses a novel dictionary updating method to optimize the dictionary learning algorithm, and we use it to extract our middle-level feature. F-SC builds the dictionary from a set of representative samples: it applies the k-means++ algorithm to select the initial samples, runs the k-medoids algorithm to search for representative samples, and then uses the samples found to build the sparse coding dictionary. Because our dictionary updating step is independent of the codeword assignment step and avoids iterating between two convex optimization problems, it reduces the computational complexity. We modeled our feature with spatial pyramid matching and compared our method with other spatial pyramid matching features on the Caltech-101, 15-Scenes, and UIUC-Sports datasets. The results show that our feature efficiently increases accuracy.
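
The dictionary construction described for F-SC in item 3 (k-means++ to pick initial samples, then k-medoids to search for representative samples) can be illustrated roughly as follows. This is only a minimal sketch under assumptions that the abstract does not state: descriptors are rows of a NumPy array, Euclidean distance is used, k-medoids is the simple alternating variant, and atoms are L2-normalized. The function names `kmeanspp_indices`, `kmedoids`, and `build_dictionary` are hypothetical and are not taken from the dissertation.

```python
import numpy as np

def kmeanspp_indices(X, k, rng):
    """Pick k sample indices by k-means++ seeding (D^2 sampling)."""
    n = X.shape[0]
    chosen = [int(rng.integers(n))]
    # Squared distance from every sample to its nearest chosen sample.
    d2 = np.sum((X - X[chosen[0]]) ** 2, axis=1)
    for _ in range(1, k):
        total = d2.sum()
        # Fall back to uniform sampling if all samples coincide with chosen ones.
        probs = d2 / total if total > 0 else np.full(n, 1.0 / n)
        idx = int(rng.choice(n, p=probs))
        chosen.append(idx)
        d2 = np.minimum(d2, np.sum((X - X[idx]) ** 2, axis=1))
    return np.array(chosen)

def kmedoids(X, init_idx, n_iter=20):
    """Alternating k-medoids: assign samples, then re-pick each medoid."""
    medoids = init_idx.copy()
    for _ in range(n_iter):
        # Assign every sample to its nearest medoid.
        dist = np.linalg.norm(X[:, None, :] - X[medoids][None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_medoids = medoids.copy()
        for j in range(len(medoids)):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # New medoid: the member with the smallest total distance to the others.
            pair = np.linalg.norm(
                X[members][:, None, :] - X[members][None, :, :], axis=2
            )
            new_medoids[j] = members[pair.sum(axis=1).argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids

def build_dictionary(X, k, seed=0):
    """Return (dictionary, medoid indices); atoms are actual samples, L2-normalized."""
    rng = np.random.default_rng(seed)
    medoids = kmedoids(X, kmeanspp_indices(X, k, rng))
    D = X[medoids]
    norms = np.linalg.norm(D, axis=1, keepdims=True)
    return D / np.maximum(norms, 1e-12), medoids

# Usage with synthetic stand-ins for local descriptors (assumed data, not real features).
X = np.random.default_rng(1).normal(size=(200, 8))
D, medoid_idx = build_dictionary(X, k=16)
```

Because every atom is an actual sample, the dictionary is fixed before any sparse codes are computed, which mirrors the independence between dictionary updating and codeword assignment claimed above. The sparse codes themselves would still have to be computed by a separate solver, which this sketch leaves out.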
Keywords/Search Tags: Image Scene Recognition, LDA (Latent Dirichlet Allocation), Topic Feature, Object Bank Feature, Sparse Coding, SPM (Spatial Pyramid Matching)