
Research On 3D Semantic Surface Reconstruction Of Large Scale Scenes Based On RGB-D Video Sequence

Posted on: 2019-12-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J T Dai
GTID: 1368330566985617
Subject: Circuits and Systems
Abstract/Summary:
3D semantic surface reconstruction of a scene, which has high research value in augmented reality, autonomous driving, and indoor robotics, recovers the 3D semantic surface of the scene from a video image sequence captured by a camera. With the continuing development of deep learning, it has become possible to apply deep-learning-based semantic segmentation to semantic surface reconstruction. This dissertation designs a 3D semantic surface reconstruction system based on an RGB-D camera and focuses on three directions: real-time tracking and optimization of the camera pose, semantic segmentation of a single RGB-D image, and 3D semantic surface reconstruction of scenes. The main innovations and contributions are as follows.

To ensure a uniform distribution of feature points in the image, this dissertation studies an adaptive ORB feature point extraction algorithm, which ensures that the overlapping area between the last frame and the current frame contains enough feature points to estimate the camera pose. An effective keyframe strategy is designed to keep camera pose tracking robust while avoiding redundant keyframes. To improve the robustness of camera pose tracking under varied camera motion in indoor environments, this dissertation studies a feature point matching method based on optical flow tracking. This method is robust to fast camera rotation and to back-and-forth motion.

To improve the accuracy of semantic segmentation with convolutional neural networks, this dissertation first proposes a spatial pyramid network module with identity shortcuts to extract multi-scale image information. The parallel branches of the module operate at different scales, set by different "hole" sizes (dilation rates) in the convolution kernels, and the residual module uses the identity connection structure to accelerate network training. The pyramid module significantly improves the accuracy of the semantic segmentation network. Second, this dissertation designs an RGB-D multi-level feature fusion network module that integrates the texture information of the color image with the structure information of the depth image. Through multi-level feature map fusion, the module fully integrates the shallow and deep features of the color and depth images, which further improves segmentation accuracy.

3D semantic surface reconstruction of large-scale scenes is performed on a TSDF spatial grid model. According to the characteristics of semantic information, this dissertation studies the representation of 3D semantic volume elements (voxels), their fusion, and the generation of the semantic surface from them. To reconstruct large-scale scenes, the system continuously shifts the TSDF spatial grid model along the camera's trajectory. This dissertation also proposes an algorithm that fuses the projection image of the 3D semantic model with the single-frame semantic segmentation image. The algorithm improves the accuracy of single-frame semantic segmentation and also improves the coherence and stability of segmentation results across consecutive frames.

Finally, this dissertation builds an experimental platform for the semantic SLAM system, and studies and designs real-time optimization methods for the software, which improve the overall real-time performance of the system. On the TUM RGB-D dataset, the robustness of the system's camera pose tracking is demonstrated. On the NYUv2 dataset, the fused semantic image is shown to improve on the single-frame semantic segmentation result. The system's 3D semantic surface reconstruction of large-scale scenes is demonstrated on large-scale sequences.
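
As a rough illustration of what fusing observations into a semantic TSDF voxel can look like, here is a minimal sketch. It is not the dissertation's method: the abstract does not state its fusion rule, so the sketch assumes the commonly used weighted running average for the TSDF value and a multiplicative (Bayesian-style) update for the per-class probabilities. All names and constants (`SemanticVoxel`, `integrate`, `TRUNCATION`, `MAX_WEIGHT`) are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical constants, chosen only for illustration.
TRUNCATION = 0.1    # truncation distance in metres
MAX_WEIGHT = 100.0  # cap on accumulated weight so old data can still be revised


@dataclass
class SemanticVoxel:
    tsdf: float = 1.0       # normalised truncated signed distance in [-1, 1]
    weight: float = 0.0     # accumulated observation weight
    class_probs: list = field(default_factory=list)  # per-class probabilities


def make_voxel(num_classes):
    """Create an unobserved voxel with a uniform class distribution."""
    return SemanticVoxel(class_probs=[1.0 / num_classes] * num_classes)


def integrate(voxel, signed_distance, frame_probs, obs_weight=1.0):
    """Fuse one depth and segmentation observation into a voxel.

    Assumed rules (not taken from the source):
    - TSDF: weighted running average of truncated, normalised distances.
    - Semantics: element-wise product of stored and observed class
      probabilities, followed by renormalisation.
    """
    d = max(-1.0, min(1.0, signed_distance / TRUNCATION))
    new_weight = voxel.weight + obs_weight
    voxel.tsdf = (voxel.tsdf * voxel.weight + d * obs_weight) / new_weight
    voxel.weight = min(new_weight, MAX_WEIGHT)

    fused = [p * q for p, q in zip(voxel.class_probs, frame_probs)]
    total = sum(fused)
    if total > 0.0:  # keep the old distribution if the product vanishes
        voxel.class_probs = [f / total for f in fused]
    return voxel


def label(voxel):
    """Return the index of the most probable class."""
    return max(range(len(voxel.class_probs)), key=voxel.class_probs.__getitem__)


voxel = make_voxel(num_classes=3)
integrate(voxel, 0.05, [0.6, 0.3, 0.1])  # first frame favours class 0
integrate(voxel, 0.03, [0.2, 0.7, 0.1])  # second frame favours class 1
```

Under these assumptions, the two frames disagree on the label and the fused distribution settles on the class with the stronger combined evidence, which is one way a multi-frame model can smooth out single-frame segmentation errors.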
Keywords/Search Tags: Simultaneous Localization and Mapping, Feature Points, Semantic Segmentation, Spatial Pyramid, 3D Semantic Surface Reconstruction, Semantic Fusion, Semantic SLAM