Intelligence Perception And Understanding Of Environment Based On Machine Vision

Posted on: 2020-07-22; Degree: Master; Type: Thesis
Country: China; Candidate: Q J Tang; Full Text: PDF
GTID: 2518306050456984; Subject: Master of Engineering
Abstract/Summary:
Intelligent perception and understanding of unknown, complex environments is an important topic in robotics and computer vision. At present, a Simultaneous Localization and Mapping (SLAM) system uses sensors to localize itself and to reconstruct the surrounding environment in three dimensions. Semantic segmentation methods based on deep convolutional neural networks, for their part, achieve high-precision pixel-level classification of images. However, a SLAM system obtains only the geometric information of the environment, and a semantic segmentation system obtains only the two-dimensional semantic information of an image. This thesis therefore combines the geometric information of the environment with the two-dimensional semantic information of the images to construct a three-dimensional semantic map, and thereby achieves intelligent perception and understanding of an unknown environment.

Firstly, the camera pose is estimated and optimized by a visual SLAM system based on an RGB-D camera. The system extracts and matches features between two adjacent frames, estimates the camera pose from the geometric relationship of the matched feature point pairs, and uses bundle adjustment to optimize the camera poses and feature point positions in the local map. In addition, a loop closure detection algorithm based on the bag-of-words model determines whether the camera trajectory contains a loop. When a loop is detected, it is added to the pose graph for global optimization, which eliminates the accumulated error and yields an optimized camera pose.

Secondly, a semantic segmentation system based on a deep convolutional neural network is designed to obtain the two-dimensional semantic information in the images. The system is based on Google's DeepLabv3+ algorithm, which combines an encoder-decoder structure with aggregated context information to improve segmentation performance. Moreover, the combination of atrous (dilated) convolution and depthwise separable convolution reduces the computation and storage required while enlarging the receptive field.

Finally, the geometric information obtained by the visual SLAM system is associated with the two-dimensional semantic information obtained by the semantic segmentation system to obtain object-level semantic information in the three-dimensional environment, and object models are created or continuously updated as the RGB-D camera moves. The result is a globally consistent semantic map of the 3D scene, which achieves intelligent perception and understanding of the environment. To save storage space and represent the map more effectively, the semantic point cloud map is converted into a semantic octree map.
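The association of geometry with per-pixel labels described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`back_project`, `to_world`, `fuse_frame`, `labelled_voxels`), the pinhole intrinsics `(fx, fy, cx, cy)`, the camera-to-world pose convention, and the majority-vote fusion rule are all assumptions made for illustration. A flat voxel hash stands in for the semantic octree, which would store the same kind of per-cell label in a hierarchical structure.

```python
import math
from collections import Counter, defaultdict


def back_project(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into the camera frame (pinhole model)."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def to_world(pose, point):
    """Apply a 4x4 camera-to-world pose (nested lists, row-major) to a 3D point."""
    x, y, z = point
    return tuple(row[0] * x + row[1] * y + row[2] * z + row[3] for row in pose[:3])


def fuse_frame(votes, depth_img, label_img, pose, intrinsics, voxel_size=0.05):
    """Add one RGB-D frame's semantic labels to the per-voxel vote counters.

    votes      : defaultdict(Counter) mapping voxel index -> label counts
    depth_img  : rows of metric depths; values <= 0 are treated as invalid
    label_img  : rows of per-pixel class labels from the segmentation network
    pose       : camera-to-world pose estimated by the SLAM front end (assumed given)
    """
    fx, fy, cx, cy = intrinsics
    for v, (depth_row, label_row) in enumerate(zip(depth_img, label_img)):
        for u, (depth, label) in enumerate(zip(depth_row, label_row)):
            if depth <= 0:
                continue  # no depth measurement for this pixel
            world = to_world(pose, back_project(u, v, depth, fx, fy, cx, cy))
            key = tuple(math.floor(c / voxel_size) for c in world)
            votes[key][label] += 1


def labelled_voxels(votes):
    """Collapse the vote counters into one label per voxel by majority vote."""
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}
```

A caller would create `votes = defaultdict(Counter)` once, call `fuse_frame` for every keyframe with its optimized pose, and read the map with `labelled_voxels(votes)`. Majority voting is only one possible fusion rule; a probabilistic update per voxel is a common alternative.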
Keywords/Search Tags: Scene Perception, Scene Understanding, Visual SLAM, Semantic Segmentation, Semantic Map