Indoor Scene Classification Based On Convolutional Auto-encoder And Dictionary Learning

Posted on: 2017-03-28 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Li
GTID: 2348330509953982 | Subject: Control Science and Engineering
Abstract/Summary:
As an extension of scene classification, indoor scene classification is widely used in image/video retrieval, mobile service robots, and other fields, and it attracts considerable attention in computer vision. Because indoor scene images suffer from irrelevant-information interference, multi-scale/multi-view changes, and similarity between categories, indoor scene classification is more challenging than general scene classification.

After summarizing current scene classification methods, this thesis analyzes the difficulties of indoor scene classification that those methods cannot handle. These difficulties can be summarized as irrelevant-information interference, multi-scale/multi-view changes, and similarity between categories. To address them, this thesis carries out research on image pre-processing, feature extraction, and classification. The main work of the thesis is summarized as follows:

For image pre-processing, this thesis adopts a method based on the visual attention mechanism to extract the regions of interest (ROI) of an indoor scene image. First, the feature maps of the input image are acquired with the classical Itti model. Second, the acquired feature maps are fused according to prior knowledge, which is achieved by a top-down visual attention mechanism. This procedure yields several feature saliency maps with different attention values, and the feature saliency map with the largest attention value is selected as the ROI map of the indoor scene image. This pre-processing effectively decreases irrelevant-information interference, which reduces the amount of computation and improves classification results.

For feature extraction, this thesis studies an approach based on a convolutional sparse auto-encoder (CSAE), which is applied to the ROI maps.
After patches are randomly selected from the ROI maps, a single-layer sparse auto-encoder (SAE) is trained on these patches, and the connection weight matrix between its input layer and hidden layer is taken as the convolution kernels of the CNN model. The convolution operation is then applied to extract complete and effective features of the indoor scene images, and the subsequent mean pooling reduces the dimension of the features. Unlike current methods, the features automatically learned by the CSAE model are insensitive to displacement of the input images. This advantage allows the CSAE to cope with the multi-scale/multi-view changes of indoor scene images.

For feature classification, this thesis studies a classification approach based on improved dictionary learning. Dictionary learning on the feature vectors learned by the CSAE model yields a shared dictionary for all categories and a category-specific dictionary for each category. To further increase the discriminative capability of the learned dictionaries, a within-class scatter constraint from the classical Fisher discrimination criterion is added.

To verify the validity of the proposed indoor scene classification approach, verification experiments are conducted on several relevant data sets. Experimental results show that the method can efficiently overcome the problems of irrelevant-information interference, multi-scale/multi-view changes, and similarity between different categories, and that it achieves better classification accuracy than existing state-of-the-art methods.
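The feature extraction step described above (random patches, SAE weights reused as convolution kernels, then mean pooling) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names, the sigmoid activation, the patch size, and the pooling size are all hypothetical, and the SAE training itself is omitted, so `weights` and `bias` stand in for parameters that would come from a trained sparse auto-encoder.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def extract_patches(image, patch_size, num_patches, rng):
    """Randomly crop square patches and flatten each one into a row vector.

    These rows would be the training set for the single-layer SAE.
    """
    h, w = image.shape
    rows = rng.integers(0, h - patch_size + 1, size=num_patches)
    cols = rng.integers(0, w - patch_size + 1, size=num_patches)
    return np.stack([
        image[r:r + patch_size, c:c + patch_size].ravel()
        for r, c in zip(rows, cols)
    ])


def convolve_features(image, weights, bias, patch_size):
    """Slide every hidden unit's weight vector over the image (valid mode).

    weights: (num_hidden, patch_size * patch_size), assumed to be the SAE's
    input-to-hidden matrix. As in most CNN code, the kernel is not flipped,
    so this is strictly a cross-correlation.
    """
    windows = np.lib.stride_tricks.sliding_window_view(
        image, (patch_size, patch_size)
    )  # shape: (out_h, out_w, patch_size, patch_size)
    out_h, out_w = windows.shape[:2]
    flat = windows.reshape(out_h, out_w, -1)
    return sigmoid(flat @ weights.T + bias)  # (out_h, out_w, num_hidden)


def mean_pool(feature_maps, pool_size):
    """Average over non-overlapping pool_size x pool_size blocks."""
    h, w, k = feature_maps.shape
    h_trim, w_trim = h - h % pool_size, w - w % pool_size
    trimmed = feature_maps[:h_trim, :w_trim]
    blocks = trimmed.reshape(
        h_trim // pool_size, pool_size, w_trim // pool_size, pool_size, k
    )
    return blocks.mean(axis=(1, 3))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    roi_map = rng.random((32, 32))           # placeholder for an ROI map
    patches = extract_patches(roi_map, 8, 100, rng)
    # Assumption: random values stand in for trained SAE parameters.
    weights = rng.normal(scale=0.1, size=(16, 8 * 8))
    bias = np.zeros(16)
    maps = convolve_features(roi_map, weights, bias, 8)
    feature_vector = mean_pool(maps, 5).ravel()
    print(patches.shape, maps.shape, feature_vector.shape)
```

With a 32x32 input, 8x8 kernels, and 5x5 pooling, the 25x25x16 feature maps shrink to a 400-dimensional vector, which shows how mean pooling reduces the feature dimension and makes the result less sensitive to small displacements of the input.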
Keywords/Search Tags:Indoor scene classification, ROI learning, CSAE, dictionary learning