
Research On 3D Semantic Segmentation For Indoor Scenes

Posted on: 2022-08-08    Degree: Master    Type: Thesis
Country: China    Candidate: Y X Li    Full Text: PDF
GTID: 2518306554950469    Subject: Computer application technology
Abstract/Summary:
Scene semantic segmentation is an important research topic in computer vision, with wide applications in virtual reality, augmented reality, indoor navigation, robotics, and other fields. An RGB image expresses the color and texture of a scene, while a point cloud or depth image expresses its spatial structure. How to fuse color-texture features with spatial-structure features, and how to segment the fused features so as to improve segmentation accuracy, is therefore of great research significance. In this thesis, deep learning methods are used to fuse RGB images with two kinds of spatial structure information, namely depth images and point clouds, to strengthen the expressive power of the features, and semantic segmentation is then performed on the fused features.

(1) Different types of man-made objects in indoor scenes often have similar color and texture, and because indoor space is limited, objects easily occlude one another. Semantic segmentation based only on the color and texture information of RGB images therefore tends to produce inaccurate boundaries or mis-segmentation. To address this problem, depth-image features and RGB-image features can be fused, so that details that are difficult to distinguish by color and texture alone are further separated by spatial-structure features. This thesis designs an image semantic segmentation network based on a dual-branch encoder, whose two branches fully extract the color-texture features of the RGB image and the spatial-structure features of the depth image. To improve the stability of the features, an atrous spatial pyramid pooling module captures targets of different sizes after fusion, and a convolutional block attention module weights the different features to further strengthen the correlation between them. Semantic segmentation experiments on the public SUN RGB-D dataset show that fusing color-texture features with spatial-structure features further improves segmentation accuracy.

(2) In point cloud semantic segmentation, the three-dimensional coordinates of the points provide rich spatial-structure features but lack the color and texture of the scene. To address this problem, the color-texture features extracted from the RGB image of the same scene can be fused with the point cloud features to further enrich the feature information of the scene. This thesis first uses a convolutional neural network as an encoder to extract color-texture features from the RGB image; the multi-layer convolution strengthens the correlation between these features. The extracted RGB features are then mapped into the point cloud space and fused with the spatial-structure features of the point cloud, supplying it with rich color-texture information. Finally, a self-attention mechanism generates global context information and establishes long-range dependencies within the features. Fusion and segmentation experiments on the indoor scene dataset ScanNet show that the semantic segmentation accuracy of point clouds is further improved after fusing the features of the RGB images.
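To make the dual-branch design in (1) concrete, the following PyTorch sketch shows one possible arrangement of the components named above: a dual-branch encoder, feature concatenation, an atrous spatial pyramid pooling block, and CBAM-style channel attention. The module names, layer sizes, and branch depths are illustrative assumptions, not the thesis's actual network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Atrous spatial pyramid pooling: parallel dilated convolutions
    capture objects at multiple scales from the fused feature map."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False)
            for r in rates
        ])
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)

    def forward(self, x):
        return self.project(torch.cat([b(x) for b in self.branches], dim=1))

class ChannelAttention(nn.Module):
    """CBAM-style channel attention: re-weights fused channels so that
    complementary RGB and depth responses reinforce each other."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(),
            nn.Linear(ch // reduction, ch))

    def forward(self, x):
        avg = self.mlp(x.mean(dim=(2, 3)))
        mx = self.mlp(x.amax(dim=(2, 3)))
        w = torch.sigmoid(avg + mx).unsqueeze(-1).unsqueeze(-1)
        return x * w

class DualBranchSegNet(nn.Module):
    """Dual-branch encoder: one branch for RGB color/texture, one for depth
    structure; features are concatenated, refined by ASPP and channel
    attention, then decoded to per-pixel class scores."""
    def __init__(self, num_classes):
        super().__init__()
        def branch(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        self.rgb_branch = branch(3)      # color-texture features
        self.depth_branch = branch(1)    # spatial-structure features
        self.aspp = ASPP(256, 128)
        self.attn = ChannelAttention(128)
        self.classifier = nn.Conv2d(128, num_classes, 1)

    def forward(self, rgb, depth):
        fused = torch.cat([self.rgb_branch(rgb), self.depth_branch(depth)], dim=1)
        fused = self.attn(self.aspp(fused))
        logits = self.classifier(fused)
        # upsample back to input resolution for per-pixel prediction
        return F.interpolate(logits, size=rgb.shape[2:], mode="bilinear",
                             align_corners=False)

# usage (toy sizes, e.g. 37 classes):
# scores = DualBranchSegNet(37)(torch.rand(1, 3, 480, 640),
#                               torch.rand(1, 1, 480, 640))
```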
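Similarly, for (2), the sketch below illustrates one way the described pipeline could be wired up: per-point geometric features are fused with 2D CNN features gathered at each point's projected pixel, and a self-attention layer models global context before classification. The projection via camera intrinsics `K`, the module `PointRGBFusion`, and all dimensions are assumptions for illustration only.

```python
import torch
import torch.nn as nn

def project_points(points, K):
    """Project 3D points (N, 3) in camera coordinates onto the image plane
    using intrinsics K (3, 3); returns integer pixel coordinates (N, 2)."""
    uvw = points @ K.T                          # (N, 3)
    uv = uvw[:, :2] / uvw[:, 2:3].clamp(min=1e-6)
    return uv.round().long()

class PointRGBFusion(nn.Module):
    """Fuses per-point geometric features with color-texture features
    sampled from a 2D CNN feature map at each point's projected pixel,
    then applies self-attention to model global context."""
    def __init__(self, geo_dim=64, rgb_dim=64, num_classes=20):
        super().__init__()
        self.fuse = nn.Linear(geo_dim + rgb_dim, 128)
        self.attn = nn.MultiheadAttention(128, num_heads=4, batch_first=True)
        self.head = nn.Linear(128, num_classes)

    def forward(self, geo_feat, rgb_feat_map, points, K):
        # geo_feat: (N, geo_dim) per-point features from a point cloud encoder
        # rgb_feat_map: (rgb_dim, H, W) feature map from a 2D CNN encoder
        _, H, W = rgb_feat_map.shape
        uv = project_points(points, K)
        u = uv[:, 0].clamp(0, W - 1)
        v = uv[:, 1].clamp(0, H - 1)
        rgb_feat = rgb_feat_map[:, v, u].T                 # (N, rgb_dim)
        fused = self.fuse(torch.cat([geo_feat, rgb_feat], dim=1))  # (N, 128)
        # self-attention over all points establishes long-range dependencies
        ctx, _ = self.attn(fused[None], fused[None], fused[None])
        return self.head(ctx[0])                           # (N, num_classes)

# usage (toy sizes):
# K = torch.tensor([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
# net = PointRGBFusion()
# scores = net(torch.rand(1024, 64), torch.rand(64, 480, 640),
#              torch.rand(1024, 3) + torch.tensor([0., 0., 2.]), K)
```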
Keywords/Search Tags: Deep Learning, Semantic Segmentation, Neural Network, RGB Images, Point Cloud