
Monocular SLAM And Dense Semantic Reconstruction Based On Depth And Semantic Information In Dynamic Scenes

Posted on: 2022-09-26
Degree: Master
Type: Thesis
Country: China
Candidate: X R Pang
GTID: 2518306551953509
Subject: Master of Engineering
Abstract/Summary:
Visual SLAM has been widely used in applications such as autonomous driving, AR, and VR. However, most SLAM systems assume that the scene is static, while most real scenes are dynamic. This causes keypoint mismatches during camera pose estimation and degrades localization accuracy. In addition, the maps built by traditional SLAM are often sparse: they can be used for localization, but not for higher-level tasks such as navigation and human-computer interaction. To address these problems, this paper proposes a new SLAM algorithm for dynamic scenes that extends traditional monocular SLAM with depth and semantic information predicted by a deep learning model, and constructs a dense three-dimensional semantic map. The specific work completed in this paper is as follows. First, to provide depth and semantic information to the SLAM system efficiently, this paper designs and implements a multi-task network that balances accuracy and efficiency and outputs depth and semantic information simultaneously. Experiments on public datasets verify the accuracy and real-time performance of the network. Second, to handle dynamic objects, this paper studies a dynamic point detection algorithm based on sparse optical flow and combines it with semantic information to detect and remove dynamic points and dynamic objects. Third, a new SLAM algorithm is proposed that combines the output of the multi-task network with the dynamic object information obtained from dynamic object processing. On the one hand, the algorithm uses the semantic segmentation map predicted by the model, combined with optical flow, to detect dynamic objects; it extracts and tracks candidate points only from the static part of the scene, and excludes the photometric residuals of matching points in dynamic regions during optimization. On the other hand, the depth map predicted by the model is used to initialize the inverse depth of key frames, thereby improving the localization accuracy of SLAM. Finally, for dense semantic mapping, this paper builds a dense semantic map of dynamic scenes on a public dataset, using the depth map and semantic map output by the multi-task network, the camera trajectory output by the SLAM system, and the detected dynamic object information.
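The dynamic point and dynamic object step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract gives no formulas, so the motion test (deviation from the median flow as a crude stand-in for camera-induced motion), the thresholds `thresh_px` and `min_ratio`, the `movable_labels` set, and all function names are assumptions chosen for illustration. The sketch takes already-tracked sparse point pairs and a per-pixel label map as input; in practice these would come from a sparse optical flow tracker and the segmentation output of the multi-task network.

```python
import numpy as np


def detect_dynamic_points(pts_prev, pts_curr, thresh_px=2.0):
    """Flag tracked points whose flow deviates from the median flow.

    Assumption: the median flow approximates the motion induced by the
    camera, so large deviations indicate independently moving points.
    pts_prev, pts_curr: (N, 2) arrays of (x, y) pixel coordinates.
    """
    flow = pts_curr - pts_prev
    median_flow = np.median(flow, axis=0)
    residual = np.linalg.norm(flow - median_flow, axis=1)
    return residual > thresh_px


def _pixel_indices(pts, shape):
    """Round (x, y) points to integer (row, col) indices inside the image."""
    h, w = shape
    cols = np.clip(np.round(pts[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.round(pts[:, 1]).astype(int), 0, h - 1)
    return rows, cols


def dynamic_object_mask(seg, pts_curr, is_dynamic, movable_labels, min_ratio=0.5):
    """Mark a whole semantic region as dynamic when enough of its points move.

    seg: (H, W) integer label map (hypothetical segmentation output).
    movable_labels: labels of classes that can move (hypothetical, e.g. person).
    """
    rows, cols = _pixel_indices(pts_curr, seg.shape)
    point_labels = seg[rows, cols]
    mask = np.zeros(seg.shape, dtype=bool)
    for label in movable_labels:
        on_label = point_labels == label
        if on_label.any() and is_dynamic[on_label].mean() >= min_ratio:
            mask |= seg == label
    return mask


def keep_static_points(pts_curr, is_dynamic, mask):
    """Keep points that neither move on their own nor lie on a dynamic object."""
    rows, cols = _pixel_indices(pts_curr, mask.shape)
    return ~is_dynamic & ~mask[rows, cols]
```

Only the points returned by `keep_static_points` would be passed on to tracking, and the pixels in the returned mask would be excluded from the photometric residuals. Masking the whole region rather than individual points reflects the abstract's distinction between removing dynamic points and removing dynamic objects.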
Keywords/Search Tags: Monocular Visual SLAM, Dynamic Scenes, Deep Learning, Dense Semantic Reconstruction