
Research On Semantic Segmentation Based On Spatial Depth Information And Cascaded CRFs

Posted on: 2020-12-13
Degree: Master
Type: Thesis
Country: China
Candidate: R Shi
Full Text: PDF
GTID: 2428330602451889
Subject: Computer Science and Technology
Abstract/Summary:
Image semantic segmentation is one of the most important research topics in computer vision. Its goal is to assign a semantic label to each pixel of an image, so that a color image is transformed into a semantically labeled image. Although deep learning has improved semantic segmentation significantly, some problems remain. In complex scenes, differences in shooting angle and uneven illumination cause different targets to overlap and make low-level visual features ambiguous, so objects with similar appearance features are often confused semantically. In addition, the down-sampling operations in convolutional neural networks discard a large amount of image features, so the context between objects is blurred and edge information is unclear in the segmentation results. Therefore, to enhance the model's ability to distinguish objects with similar features and to locate object boundaries, this thesis proposes two semantic segmentation methods. The first is based on an encoder-decoder with spatial depth information. The second is based on an encoder-decoder with cascaded conditional random fields (CRFs). The two methods are described as follows.

(1) Considering the characteristics of depth images, a semantic segmentation method based on spatial information is proposed. It adds the depth map, i.e. the spatial information of the scene, to the features learned from RGB images, so as to alleviate the model's confusion of objects with similar features. Inspired by the advantages of the encoder-decoder model and the spatial pyramid structure, we first establish an encoder-decoder model based on a spatial pyramid pooling architecture, named Basic Net. Spatial depth information is then introduced into this model to build a two-branch semantic segmentation model for RGB-D images, in which the two branches learn RGB features and spatial depth features from the RGB images and the depth images respectively. So that the model obtains as much spatial depth information as possible while learning semantic information, the features of the two branches are fused several times, and different local features are then extracted by spatial pyramid pooling, so that overlapping and easily confused objects can be distinguished using spatial depth information. Finally, the effectiveness of the proposed method is verified by a number of comparative experiments.

(2) Although some semantic segmentation methods use conditional random fields to obtain boundary information, they usually process only the final output of the model. In this thesis, a semantic segmentation model based on cascaded conditional random fields is established to learn boundary information from different levels of the model and to enhance its ability to locate object boundaries. Inspired by the skip connections in FCN and the good boundary localization ability of conditional random fields, a cascaded CRFs module is designed and introduced into the decoding stage of Basic Net. Specifically, the outputs of the decoders in Basic Net are processed by conditional random fields, and the output of the current conditional random field is taken as the input of the next one, forming a cascade. As the conditional random fields are cascaded, the deep and shallow features of the image are supplemented layer by layer, and the boundary contours of objects are located more accurately. To further supplement the semantic information of the image, the output of the cascaded CRFs is fused with the output of the last decoder, so that the model locates object boundaries better and produces more accurate semantic segmentation results. Finally, a number of experiments on different data sets show that this method enhances the model's ability to locate target boundary information.
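The data flow of the cascaded CRFs module in method (2) can be illustrated with a minimal sketch. It is not the thesis implementation, and several points are assumptions made only for illustration: all decoder outputs are taken to have the same resolution, a decoder output and the previous refined map are combined by element-wise addition, and `smooth` is a simple 4-neighbour averaging step that stands in for CRF inference (it has no learned pairwise potentials). The names `smooth`, `add_maps`, `cascade`, and `argmax_labels` are hypothetical.

```python
# Score maps are nested lists: score_map[y][x][c] is the score of class c
# at pixel (y, x). Pure standard-library Python, no external packages.

def smooth(score_map):
    """Stand-in for CRF inference (assumption, NOT a real CRF):
    average each pixel's class scores with its in-bounds 4-neighbours."""
    h, w = len(score_map), len(score_map[0])
    n_cls = len(score_map[0][0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neigh = [(y, x)] + [
                (y + dy, x + dx)
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= y + dy < h and 0 <= x + dx < w
            ]
            row.append([
                sum(score_map[j][i][c] for j, i in neigh) / len(neigh)
                for c in range(n_cls)
            ])
        out.append(row)
    return out


def add_maps(a, b):
    """Element-wise sum of two score maps of identical shape."""
    return [
        [[sa + sb for sa, sb in zip(pa, pb)] for pa, pb in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


def cascade(decoder_outputs):
    """Refine the decoder outputs in order, feeding each refined map
    into the next stage, then fuse with the last decoder output."""
    if not decoder_outputs:
        raise ValueError("need at least one decoder output")
    refined = None
    for scores in decoder_outputs:
        # Assumption: the previous refined map is added to this stage's input.
        stage_input = scores if refined is None else add_maps(scores, refined)
        refined = smooth(stage_input)
    # Final fusion with the output of the last decoder, as in the abstract.
    return add_maps(refined, decoder_outputs[-1])


def argmax_labels(score_map):
    """Per-pixel class index with the highest score."""
    return [
        [max(range(len(px)), key=px.__getitem__) for px in row]
        for row in score_map
    ]


# Usage: a 3x3, two-class map whose centre pixel is mislabelled.
noisy = [[[1.0, 0.0] for _ in range(3)] for _ in range(3)]
noisy[1][1] = [0.4, 0.6]
labels = argmax_labels(cascade([noisy, noisy]))
```

In this toy input the centre pixel initially favours the wrong class, and the cascade pulls it back toward its neighbours. A real CRF would additionally weight neighbours by colour and position similarity so that true object boundaries are preserved rather than smoothed away.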
Keywords/Search Tags: Semantic Segmentation, Encoder-Decoder Model, Spatial Pyramid Pooling, Spatial Depth Information, Conditional Random Fields