Superpixel Deep Network Based RGB-D Indoor Scene Semantic Segmentation

Posted on: 2020-02-18    Degree: Master    Type: Thesis
Country: China    Candidate: J L Lu    Full Text: PDF
GTID: 2428330623456734    Subject: Computer technology
Abstract/Summary:
Indoor scene semantic segmentation is a fundamental research topic in computer vision and a core task in indoor scene understanding; it has long been both a research hotspot and a difficult problem in image processing. It assigns a semantic label to each pixel of a scene image. Because it combines traditional problems such as detection, segmentation, and multi-label recognition, it is a challenging task. Its results provide an object-level semantic understanding of the scene, which is of great value for automatic environment understanding and has been widely applied in robot vision, security monitoring, firefighting, and other scenarios.

In recent years deep networks have proved to be a powerful means of extracting image features, and in indoor scene semantic segmentation they are often used to extract pixel-level features. End-to-end methods use a deep network to infer the semantic labels of pixels from the top-layer feature maps and transfer these labels back to the input image by up-sampling. However, because the resolution of the high-level feature maps is much lower than that of the input image, the resulting segmentation boundaries are blurred. In addition, an image of size 480×640 contains more than 300,000 pixels, so the number of computing units the deep network must process is very large.

A superpixel is a set of adjacent pixels with similar features such as color, brightness, and texture, and it retains most of the structural information needed for segmentation. If an image is represented as a collection of superpixels (usually fewer than 1,000 per image) and the superpixel is taken as the unit of computation, the number of computing units drops significantly, which reduces the computational complexity. For this reason, this work takes superpixels as the input of the deep network in order to alleviate both the blurred segmentation boundaries and the heavy computational cost.

(1) A new superpixel deep network, SuperPixelNet, is proposed. The network takes a superpixel and its neighboring superpixels as input to learn multi-modal features of the superpixel, and it integrates local and global features to perform superpixel-level RGB-D indoor scene semantic segmentation.

(2) A new superpixel deep network, RCN (Region Classification Net), is proposed. The network consists of two subnetworks that take the RGB image and the HHA image as input to extract color features and depth features, respectively. A multi-level feature representation of each superpixel is obtained by combining the bounding box of the superpixel with the feature maps. Based on this representation, each superpixel is classified, which yields a superpixel-level indoor scene semantic segmentation.

Experimental results on the public indoor scene datasets NYU Depth V1 and NYU Depth V2 show that the proposed methods achieve good semantic segmentation performance...
Keywords/Search Tags: RGB-D indoor scene, semantic segmentation, superpixel, deep networks
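
The reduction in computing units described in the abstract can be illustrated with a minimal sketch. It is not the thesis's SuperPixelNet or RCN: the regular 20×20 grid is an assumed stand-in for a real superpixel algorithm, the four-channel feature array holds random placeholder values, and the helper names `grid_superpixels` and `pool_features` are hypothetical.

```python
import numpy as np


def grid_superpixels(height, width, cell):
    """Label map that tiles the image with cell x cell blocks.

    Assumption: a regular grid only stands in for a real superpixel
    algorithm, which would follow color, brightness, and texture.
    """
    rows = np.arange(height) // cell
    cols = np.arange(width) // cell
    n_cols = -(-width // cell)  # ceiling division: blocks per row
    return rows[:, None] * n_cols + cols[None, :]


def pool_features(features, labels):
    """Average per-pixel features (H, W, C) inside each superpixel -> (K, C)."""
    flat_labels = labels.ravel()
    flat_feats = features.reshape(-1, features.shape[-1])
    k = int(flat_labels.max()) + 1
    sums = np.zeros((k, flat_feats.shape[1]))
    np.add.at(sums, flat_labels, flat_feats)  # unbuffered scatter-add per label
    counts = np.bincount(flat_labels, minlength=k)
    return sums / counts[:, None]


height, width = 480, 640
labels = grid_superpixels(height, width, cell=20)

# Placeholder per-pixel features (for example RGB plus depth); values are random.
rng = np.random.default_rng(0)
features = rng.random((height, width, 4))

pooled = pool_features(features, labels)
print(height * width, "pixels ->", pooled.shape[0], "superpixels")
```

With these assumed settings the 307,200 pixels of a 480×640 image collapse to 768 units, consistent with the abstract's figure of fewer than 1,000 superpixels per image.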