Research On Semantic Mapping Of Indoor Environment Based On RGB-D Camera

Posted on: 2021-04-26 | Degree: Master | Type: Thesis
Country: China | Candidate: D J Zou | Full Text: PDF
GTID: 2428330605976846 | Subject: Control engineering
Abstract/Summary:
To complete tasks in complex indoor scenes, a mobile robot must estimate its own position precisely and build a three-dimensional map of the environment that supports semantic perception. Simultaneous Localization and Mapping (SLAM) is the key technology that lets a robot localize itself autonomously while constructing a map of its environment: the robot gathers environmental information through its sensors, builds the map incrementally, and estimates its position at the same time. When the sensor is a camera, the approach is called visual SLAM. Visual SLAM can accurately reconstruct the environment the robot has explored, but it cannot extract high-level semantic information about the scene content, so it does not meet the needs of robots performing advanced tasks. With the continuing development of deep learning and computer vision, semantic information about the environment can be obtained through object detection algorithms based on Convolutional Neural Networks (CNNs), which provides the conditions needed to raise the level of intelligence of mobile robots. This thesis therefore combines a visual SLAM algorithm with a CNN-based object detection algorithm to construct three-dimensional semantic maps of indoor environments. The main work of this thesis covers the following three aspects.

First, to meet the requirements of autonomous localization and environmental mapping, the visual SLAM system is studied in depth, including the mathematical description of visual SLAM, the camera model and spatial coordinate systems, and the individual modules of the visual SLAM framework. Feature extraction and matching experiments and 3D point cloud stitching experiments are carried out for the visual odometry at the front end of the system, and good feature matching and point cloud stitching results are obtained.

Second, to meet the requirement of semantic perception of the environment, the principles and structures of neural networks and convolutional neural networks are studied. The YOLOv3 network model is selected as the object detection model for extracting the semantics of objects in the environment. In addition, a pose optimization method based on semantic information is proposed. The object detection algorithm is tested on datasets and on images of an office environment. The detection results show that the algorithm strikes a good balance between detection accuracy and detection speed, maintaining high accuracy while meeting the real-time requirement of object detection. The proposed pose estimation algorithm is also evaluated experimentally, and its localization and pose estimation accuracy meets the performance requirements of mobile robots.

Third, to address semantic map construction, the visual SLAM system is combined with the real-time object detection algorithm, and an improved ORB-SLAM2 algorithm serves as the basis for building the three-dimensional map of the environment. The YOLOv3 object detection algorithm perceives the environment semantically, semantic information is added to the target points in the 3D map through data association and model update, and a semantically annotated three-dimensional map is constructed. The NYU dataset and data collected with a Kinect v2 camera are used to test and analyze the semantic map construction method, generating 3D point cloud maps and semantic maps and verifying the feasibility of the algorithm.

In summary, this thesis covers a visual SLAM system based on an RGB-D camera and an object detection algorithm based on convolutional neural networks. It yields a pose estimation algorithm based on semantic information and a method for building semantically annotated three-dimensional maps. The favorable experimental results lay a foundation for mobile robots to later perform complex tasks such as human-robot interaction.
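The core geometric step behind attaching detection labels to 3D map points can be illustrated with a short sketch. It is not the thesis's implementation: it assumes a standard pinhole camera model, a depth image already registered to the color image and expressed in meters, and a camera pose given as a rotation matrix and translation vector. It also replaces the data association and model update described above with naive labelling of every valid pixel inside a detection box. All names (`Intrinsics`, `Detection`, `back_project`, `to_world`, `label_points`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point3 = Tuple[float, float, float]


@dataclass
class Intrinsics:
    """Pinhole camera parameters (assumed, in pixels)."""
    fx: float
    fy: float
    cx: float
    cy: float


@dataclass
class Detection:
    """A 2D detection such as an object detector might return (hypothetical format)."""
    label: str
    box: Tuple[int, int, int, int]  # (u_min, v_min, u_max, v_max), max is exclusive


def back_project(u: int, v: int, depth_m: float, K: Intrinsics) -> Point3:
    """Map pixel (u, v) with metric depth to a 3D point in the camera frame."""
    x = (u - K.cx) * depth_m / K.fx
    y = (v - K.cy) * depth_m / K.fy
    return (x, y, depth_m)


def to_world(p: Point3, R: Sequence[Sequence[float]], t: Sequence[float]) -> Point3:
    """Apply the camera-to-world pose: p_world = R * p_camera + t."""
    return tuple(
        sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)
    )


def label_points(
    depth: Sequence[Sequence[float]],
    K: Intrinsics,
    R: Sequence[Sequence[float]],
    t: Sequence[float],
    detections: List[Detection],
) -> List[Tuple[Point3, str]]:
    """Return (world_point, label) pairs for valid-depth pixels inside each box.

    Naive simplification: every pixel in the box gets the box label, so
    background pixels inside the box are mislabelled.
    """
    labeled = []
    for det in detections:
        u0, v0, u1, v1 = det.box
        for v in range(v0, v1):
            for u in range(u0, u1):
                z = depth[v][u]
                if z <= 0:  # zero or negative depth marks a missing measurement
                    continue
                p_cam = back_project(u, v, z, K)
                labeled.append((to_world(p_cam, R, t), det.label))
    return labeled
```

Transforming every frame's points into the world frame with that frame's pose is also the basic operation behind point cloud stitching. A practical system would additionally filter out background pixels inside each box and fuse labels across frames, which this sketch leaves out.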
Keywords/Search Tags: Visual SLAM, Object Detection, Pose Estimation, Semantic Mapping