Simultaneous localization and mapping (SLAM) is an important research topic in robotics: a robot carrying cameras or lidar sensors must localize itself in an unknown environment while simultaneously building a map of it. In recent years, driven by augmented reality and autonomous driving applications, visual SLAM has attracted extensive attention. Visual SLAM uses images as the main source of perceptual information, estimating the camera pose and reconstructing the 3D scene through multi-view geometry. Visual odometry is the core component of visual SLAM and the foundation of self-localization and mapping. Recent work has shown that unlabeled monocular video can be used to train convolutional neural networks for depth prediction and ego-motion estimation. However, for lack of appropriate constraints, the scale of the network output is inconsistent across samples; because of this per-frame scale ambiguity, the ego-motion network cannot produce a complete, consistent camera trajectory over a long video sequence.

In this paper, we design an end-to-end visual odometry network based on deep learning, consisting of a camera ego-motion estimation network and an image depth prediction network. The underlying principle is view synthesis: one image is warped into another using the predicted depth and ego-motion, and the image reconstruction loss serves as the supervision signal, so the network can be trained on monocular video alone; in that setting training is fully unsupervised. In traditional visual odometry, loop closure detected in the video itself is often used to correct the ego-motion estimates, but with deep learning methods loop detection over long videos becomes difficult. We therefore propose forming loops among nearby frames (within 10 frames) and adding the mismatch between the composed ego-motion and the loop constraint as an extra supervision signal during training. This idea resolves the inconsistency between the scale of the depth prediction network and that of the ego-motion estimation network.

Experiments on the KITTI dataset show that adding local loop detection greatly improves the accuracy of ego-motion estimation, and our visual odometry is competitive with recent models trained on stereo video. Finally, inspired by the ability of CNNs to extract relative depth information from monocular images, we propose an object size estimation network that predicts the real-world size of objects in a monocular image, yielding a correspondence between metric scale and image pixels. Comparing these predictions with the previously trained depth prediction network gives the ratio between the predicted depth and the true scale, and applying this ratio to the estimated ego-motion yields a metrically scaled camera trajectory from a monocular video sequence.
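As a concrete illustration of the view-synthesis supervision described above, the standard monocular formulation (not necessarily the authors' exact loss) can be written as follows, assuming known camera intrinsics $K$ and bilinear sampling $\langle\cdot\rangle$ in the source frame:

\[
\hat{p}_s \sim K\, \hat{T}_{t\to s}\, \hat{D}_t(p_t)\, K^{-1} p_t,
\qquad
\mathcal{L}_{\text{photo}} = \sum_{p_t} \left| I_t(p_t) - I_s\!\left\langle \hat{p}_s \right\rangle \right|,
\]

where $\hat{D}_t$ is the predicted depth of the target frame $I_t$ and $\hat{T}_{t\to s}$ is the predicted camera motion from the target frame to the source frame $I_s$.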
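The local loop-closure supervision can be sketched as a consistency loss between the ego-motion predicted directly across a short loop and the composition of the intermediate single-step predictions. The following is a minimal, hypothetical sketch; the function name pose_net, its assumed convention (returning a 4x4 transform taking frame a's coordinates into frame b's), and the window size k are illustrative assumptions, not the authors' implementation.

import torch

def loop_consistency_loss(pose_net, frames, k=3):
    # Hypothetical sketch: for each loop of length k (k < 10), compare the
    # direct prediction T_{i -> i+k} against the chained single-step
    # predictions; the closed loop should be (close to) the identity.
    loss = 0.0
    for i in range(len(frames) - k):
        # Direct prediction over the loop's long edge.
        T_direct = pose_net(frames[i], frames[i + k])
        # Chain the k intermediate single-step predictions.
        T_chain = torch.eye(4, device=T_direct.device)
        for j in range(i, i + k):
            T_chain = pose_net(frames[j], frames[j + 1]) @ T_chain
        # Mismatch between the two routes around the loop.
        residual = torch.linalg.inv(T_chain) @ T_direct - torch.eye(4, device=T_direct.device)
        loss = loss + residual.abs().mean()
    return loss / max(len(frames) - k, 1)

In practice this term would be added to the photometric reconstruction loss with a weighting factor chosen on a validation split.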
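The scale-recovery step can likewise be sketched under simple assumptions: the object size network predicts metric sizes for detected objects, the depth network implies sizes in its own units, and the median of their ratio rescales the trajectory. The function and argument names below are hypothetical.

import numpy as np

def metric_scale_factor(pred_sizes_m, depth_implied_sizes):
    # Ratio between real metric scale and the depth network's internal scale,
    # aggregated robustly (median) over detected objects.
    ratios = np.asarray(pred_sizes_m) / np.asarray(depth_implied_sizes)
    return np.median(ratios)

def rescale_trajectory(poses, scale):
    # Apply the recovered scale to the translation part of each 4x4 pose.
    scaled = []
    for T in poses:
        T = T.copy()
        T[:3, 3] *= scale
        scaled.append(T)
    return scaled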