
Research On Single-view 3D Reconstruction And Visual SLAM Method Based On Deep Learning

Posted on: 2024-03-27
Degree: Master
Type: Thesis
Country: China
Candidate: J Ni
Full Text: PDF
GTID: 2568307154498974
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
The development of image-based 3D vision technology is among the most challenging research directions in computer vision. In recent years, advances in deep learning have inspired new research ideas that combine traditional geometric methods with semantic information. Among these, the most prominent applications are single-view 3D reconstruction and visual SLAM.

Single-view 3D reconstruction is a challenging task because of the inherent ambiguity in recovering a 3D model from a single 2D image. Data-driven deep learning techniques have been employed to reconstruct 3D models by exploiting prior knowledge of the 3D world. However, simple linear representations are inadequate for modeling complex objects and struggle with multi-object occlusion and reconstruction efficiency. We therefore propose an encoder-decoder network framework to address these challenges in single-view 3D reconstruction. The framework integrates local and global features so that the network can better capture the intricate details of complex objects. In addition, a novel encoder module is proposed that fuses a channel attention mechanism with residual convolution to enhance semantic discrimination between different objects and capture deeper correlations among them, thereby addressing multi-object occlusion. Finally, a pixel-based backprojection ray decoding method is proposed to improve reconstruction efficiency. Compared with existing methods, the approach presented in this thesis is more accurate, more efficient, and adaptable to a wider range of object scenarios.

Early visual SLAM systems relied on feature point matching, camera pose estimation, and bag-of-words descriptors for mapping and localization. Although this approach offered high localization accuracy, it had limited ability to understand complex scenes: tracking and relocalization often failed under significant changes in camera viewpoint or environment. To address these limitations, we propose a semantic visual SLAM system that combines traditional visual SLAM techniques with 3D reconstruction and object detection. The system uses ellipsoid modeling to establish associations between 2D image feature points and 3D map points, and uses semantic information to guide camera pose optimization. It also builds a semantic map based on object models, enabling relocalization across a wider range of camera views and improving map-based initialization and tracking. By incorporating deep learning techniques for object detection and recognition, the proposed system achieves superior performance compared with traditional methods.
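The abstract does not spell out the internals of the proposed encoder module, but a fusion of channel attention and residual convolution is typically built in a squeeze-and-excitation style: globally pool each channel, pass the pooled vector through a small bottleneck with a sigmoid, and reweight the channels inside a residual branch. The sketch below is a minimal, hypothetical NumPy version of that idea (the weight shapes `w1`, `w2` and the reduction ratio are assumptions, not details from the thesis):

```python
import numpy as np

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style channel attention (illustrative).
    x:  feature map of shape (C, H, W)
    w1: bottleneck weights, shape (C // r, C)  -- r is a reduction ratio
    w2: expansion weights,  shape (C, C // r)
    """
    s = x.mean(axis=(1, 2))               # squeeze: global average pool -> (C,)
    z = np.maximum(w1 @ s, 0.0)           # excitation: FC + ReLU -> (C // r,)
    a = 1.0 / (1.0 + np.exp(-(w2 @ z)))   # FC + sigmoid -> per-channel weights in (0, 1)
    return x * a[:, None, None]           # reweight each channel of the feature map

def residual_attention_block(x, w1, w2):
    """Residual connection around the attention branch: out = x + attn(x)."""
    return x + channel_attention(x, w1, w2)
```

Because the attention weights lie in (0, 1), the residual output for each element stays between the input and twice the input, so the block can only emphasize channels, never suppress the identity path.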
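The thesis does not give its exact formulation of ellipsoid modeling, but ellipsoidal object landmarks in semantic SLAM are commonly represented as dual quadrics, which project to dual conics in the image via C* = P Q* Pᵀ; the detected object's bounding ellipse can then be matched against this projection for data association and pose optimization. The following is a hedged NumPy sketch of that standard projection (the camera matrix and sphere placement in the test are illustrative choices, not values from the thesis):

```python
import numpy as np

def sphere_dual_quadric(center, radius):
    """Dual quadric (4x4) of a sphere: Q* = T diag(r^2, r^2, r^2, -1) T^T,
    where T translates the canonical sphere to `center`."""
    Q = np.diag([radius**2, radius**2, radius**2, -1.0])
    T = np.eye(4)
    T[:3, 3] = center
    return T @ Q @ T.T

def project_dual_quadric(P, Q_star):
    """Project a dual quadric to a dual conic (3x3) with a 3x4 camera matrix P:
    C* = P Q* P^T. The conic outlines the ellipsoid's silhouette in the image."""
    return P @ Q_star @ P.T
```

For example, a unit sphere at the world origin seen by a camera P = [I | (0, 0, 5)ᵀ] projects to the dual conic diag(1, 1, -24), i.e. the image circle x² + y² = 1/24, matching the pinhole silhouette radius 1/√(d² - r²) with d = 5, r = 1.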
Keywords/Search Tags:Computer Vision, Deep Learning, Single View 3D Reconstruction, Visual SLAM