Research On Improved Visual SLAM Technology Based On Fusion Of Multiple Neural Networks

Posted on:2023-04-25

Degree:Master

Type:Thesis

Country:China

Candidate:Y F Jin

GTID:2558306629477374

Subject:Control engineering

Abstract/Summary:

In recent years, the information industry has developed rapidly. In the current artificial intelligence boom, many industries have joined the research and development of unmanned equipment, in which intelligent robots play a very important role. From small indoor sweeping robots to large outdoor unmanned vehicles, all rely on Simultaneous Localization and Mapping (SLAM) technology. An intelligent robot must perceive its position and orientation in three-dimensional space in real time while moving so that it can localize itself accurately. It must also build a map of the environment to carry out path planning and obstacle avoidance. Compared with laser sensors, vision sensors have advantages in price and weight, so vision-based SLAM has attracted much attention. As SLAM application scenarios multiply, the requirements for system accuracy and stability keep rising. With the continuous development of deep learning, many researchers have begun to study deep learning-based visual SLAM in the hope of obtaining better accuracy and stronger robustness.

At present, deep learning-based visual SLAM has made many breakthroughs, especially in image feature extraction and matching, where it has shown excellent performance. However, some problems remain. On the one hand, although visual SLAM based on monocular depth estimation can overcome many shortcomings of conventional monocular SLAM, the accuracy of pose estimation and dense reconstruction is often insufficient because the estimated depth is inaccurate. On the other hand, when deep learning-based camera relocalization methods face complex scenes or noisy images, relocalization accuracy drops sharply and system stability is poor.

To address these problems, this thesis studies an improved visual SLAM scheme that integrates multiple neural networks. It proposes an indoor positioning and 3D reconstruction method based on monocular depth estimation, and a camera relocalization method based on image denoising and CNN-LSTM (Convolutional Neural Network and Long Short-Term Memory), in order to obtain more accurate pose estimation and better 3D reconstruction and relocalization. The main contributions are as follows:

1) To address the problem that the depth produced by existing visual SLAM solutions based on monocular depth estimation is not accurate enough for subsequent pose estimation and dense reconstruction, this thesis proposes an indoor positioning and 3D reconstruction method based on monocular depth estimation. The monocular depth estimation model, based on transfer learning and a CNN, predicts the depth of an image through an encoder-decoder structure. With the predicted depth, front-end ORB (Oriented FAST and Rotated BRIEF) feature extraction and matching and back-end direct RGB-D BA (Bundle Adjustment) optimization achieve indoor positioning and 3D reconstruction with a monocular device. Experimental comparison shows that the model used obtains more accurate depth information and also performs better in pose estimation and dense reconstruction on the public TUM dataset. This shows that the proposed method not only makes up for the shortcomings of traditional monocular SLAM but also ensures high positioning and mapping accuracy.

2) To address the problems that existing deep learning-based image denoising methods tend to produce artifacts on sharp edges, and that a single model cannot handle denoising tasks at different noise levels, this thesis proposes a Gaussian denoising method with adjustable noise level based on a dilated CNN. The method gives the trained model the ability to adjust the noise level by adding a noise level map that matches the input image. In addition, while the dilation rate of the convolutions is reduced, reversible downsampling is used to expand the receptive field of the convolution kernels and capture more context without increasing the number of parameters or network layers. Experimental results on public datasets show that the proposed method can adjust the noise level while benefiting from GPU acceleration, achieves a higher PSNR in objective evaluation, and preserves more detail in the denoised image. It is therefore reasonable to apply this model to the subsequent denoising preprocessing.

3) To address the problem that the accuracy of existing deep learning-based camera relocalization methods drops sharply in cluttered scenes or on images with too much noise, this thesis proposes a camera relocalization method based on image denoising and CNN-LSTM. First, the proposed image denoising method is used to preprocess the data and alleviate the accuracy degradation caused by excessive noise. Then, a CNN learns deep features of the denoised images and an LSTM extracts more abstract deep spatial information, from which the camera's 6-DOF pose is regressed. Finally, experiments are carried out on public datasets and compared with current mainstream algorithms. The results show that the proposed method can accurately predict the camera's 6-DOF pose in complex scenes and improves the accuracy of the camera relocalization model. It also enhances the robustness and real-time performance of the system.

This thesis studies an improved visual SLAM scheme that fuses multiple neural networks. The indoor positioning and 3D reconstruction method based on monocular depth estimation obtains better pose estimation and reconstruction results, and the relocalization method based on image denoising and CNN-LSTM regresses more accurate 6-DOF poses. This provides more options for the field of intelligent robots.
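
The input preparation described in contribution 2) can be illustrated with a minimal sketch. It is not the thesis implementation: the downsampling factor of 2, the uniform noise level map, and all function names (`space_to_depth`, `depth_to_space`, `with_noise_level_map`, `psnr`) are assumptions made for illustration, since the abstract gives no such details. The sketch shows that reversible downsampling only rearranges pixels into sub-images (so it can be undone exactly), how a noise level map can be appended as an extra input channel, and how PSNR, the objective metric named in the abstract, is computed.

```python
import math


def space_to_depth(img, r=2):
    """Reversible downsampling: split an H x W image into r*r sub-images."""
    h, w = len(img), len(img[0])
    if h % r or w % r:
        raise ValueError("image height and width must be divisible by r")
    # Sub-image k = dy * r + dx keeps the pixels at rows dy, dy+r, ... and
    # columns dx, dx+r, ...; no pixel is discarded.
    return [
        [[img[y][x] for x in range(dx, w, r)] for y in range(dy, h, r)]
        for dy in range(r)
        for dx in range(r)
    ]


def depth_to_space(subs, r=2):
    """Inverse of space_to_depth: interleave the sub-images back."""
    hs, ws = len(subs[0]), len(subs[0][0])
    out = [[0] * (ws * r) for _ in range(hs * r)]
    for k, sub in enumerate(subs):
        dy, dx = divmod(k, r)
        for y in range(hs):
            for x in range(ws):
                out[y * r + dy][x * r + dx] = sub[y][x]
    return out


def with_noise_level_map(subs, sigma):
    """Append a uniform noise level map (assumed form) as an extra channel."""
    hs, ws = len(subs[0]), len(subs[0][0])
    return subs + [[[sigma] * ws for _ in range(hs)]]


def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    diffs = [(a - b) ** 2 for ra, rb in zip(ref, test) for a, b in zip(ra, rb)]
    mse = sum(diffs) / len(diffs)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Usage: a 4 x 4 image becomes four 2 x 2 sub-images plus one noise channel.
image = [[4 * y + x for x in range(4)] for y in range(4)]
channels = with_noise_level_map(space_to_depth(image), sigma=25)
restored = depth_to_space(channels[:4])  # drop the noise map, then invert
```

Because each sub-image has half the resolution, a convolution with the same kernel size covers a larger area of the original image, which is one way to read the abstract's statement that the receptive field grows without extra parameters or layers.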

Keywords/Search Tags:

Visual SLAM, Deep Learning, Monocular Depth Estimation, Image Denoising, Camera Relocalization

Related items

1	Research On Key Technologies Of Monocular SLAM Based On Deep Learning Method
2	Research On Front-End Key Technologies Of Monocular Visual SLAM Based On Deep Learning
3	Research On Key Technologies Of Monocular Visual SLAM Based On Deep Learning Methods
4	SLAM Technology Research Based On Monocular Vision
5	Research On Deep Learning-based Visual SLAM System
6	Monocular Simultaneous Localization And Mapping System Based On Binary Visual Features
7	Research And Implementation Of Deep Learning Based Visual SLAM Technology
8	Research On Camera Pose Estimation On Deep Learning
9	Research On Monocular Visual Semantic SLAM In Complex Environment
10	Research On Monocular SLAM Dense Mapping Method Based On Deep Learning