Relative Camera Pose Estimation Using Deep Networks

Posted on: 2019-01-31
Degree: Master
Type: Thesis
Country: China
Candidate: L J Zhang
Full Text: PDF
GTID: 2348330545993381
Subject: Control Science and Engineering
Abstract/Summary:
Estimating camera pose from visual information is known as visual odometry (VO). Over the last twenty years, VO has been widely used in robot navigation. Most existing VO algorithms rely on geometric constraints and involve a series of tedious steps such as feature extraction, feature matching, and pose estimation. Geometry-based methods are affected by images captured in rainy or foggy conditions, and the camera must be recalibrated whenever the platform changes. With the recent development of deep learning, several learning-based VO algorithms have appeared; they bypass the tedious steps of geometry-based methods and obtain the pose in an end-to-end manner. This thesis presents two novel methods for estimating camera pose, one based on a convolutional neural network (CNN) and one based on a recurrent convolutional neural network (RCNN). Both are trained and tested on the KITTI VO dataset and show results comparable to several geometry-based VO algorithms. The main contributions of this work are as follows:

1. A method for generating pose labels for images in the KITTI VO dataset, covering absolute poses for a single image and relative poses for two adjacent images. The generated poses are fed to the deep networks for training.
2. CNN-VO, a method for estimating the relative camera pose between two images. The network takes two adjacent images as input and outputs a 6-D relative pose consisting of a 3-D translation and a 3-D rotation. Image pairs in reversed order are also added to the training process to improve the network's generalization ability.
3. CNN-LSTM-VO, a method for estimating the relative camera poses over an image sequence. The network takes a sequence of raw RGB images as input and outputs the sequence of relative poses between every two adjacent images. The method benefits from the RNN's strength in processing sequential signals and performs better than the pure CNN method. The training set is also enlarged with image sequences in reversed order to obtain better performance.
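
The pose-label generation and reversed-order augmentation described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the thesis's implementation. It assumes the standard KITTI odometry ground-truth layout (one pose per line, 12 values forming a row-major 3x4 matrix [R|t]) and assumes the 3-D rotation is expressed as ZYX Euler angles, which the abstract does not specify. The function names `parse_kitti_pose`, `relative_pose`, `rotation_to_euler`, and `pose_label` are hypothetical.

```python
import math


def parse_kitti_pose(line):
    """Parse one pose line: 12 floats, row-major 3x4 matrix [R|t]."""
    v = [float(x) for x in line.split()]
    if len(v) != 12:
        raise ValueError("expected 12 values per pose line")
    R = [v[0:3], v[4:7], v[8:11]]   # 3x3 rotation
    t = [v[3], v[7], v[11]]         # 3-D translation
    return R, t


def relative_pose(pose_a, pose_b):
    """Pose of frame b expressed in frame a: T_rel = T_a^{-1} T_b."""
    Ra, ta = pose_a
    Rb, tb = pose_b
    # R_rel = Ra^T Rb
    R_rel = [[sum(Ra[k][i] * Rb[k][j] for k in range(3)) for j in range(3)]
             for i in range(3)]
    # t_rel = Ra^T (tb - ta)
    d = [tb[k] - ta[k] for k in range(3)]
    t_rel = [sum(Ra[k][i] * d[k] for k in range(3)) for i in range(3)]
    return R_rel, t_rel


def rotation_to_euler(R):
    """ZYX Euler angles (an assumed parameterization), away from gimbal lock."""
    rx = math.atan2(R[2][1], R[2][2])
    ry = math.asin(max(-1.0, min(1.0, -R[2][0])))
    rz = math.atan2(R[1][0], R[0][0])
    return [rx, ry, rz]


def pose_label(pose_a, pose_b):
    """6-D label for the ordered pair (a, b): [tx, ty, tz, rx, ry, rz]."""
    R_rel, t_rel = relative_pose(pose_a, pose_b)
    return t_rel + rotation_to_euler(R_rel)


# Usage: two made-up poses in KITTI line format (illustrative values only).
pose_0 = parse_kitti_pose("1 0 0 0 0 1 0 0 0 0 1 0")
pose_1 = parse_kitti_pose("1 0 0 0.5 0 1 0 0 0 0 1 2")
forward_label = pose_label(pose_0, pose_1)   # label for the pair (0, 1)
reversed_label = pose_label(pose_1, pose_0)  # label for the reversed pair (1, 0)
```

Under these assumptions, the label for a reversed pair is obtained by swapping the two arguments, so the reversed-order training samples need no separate ground truth: the same absolute poses yield the inverse relative transform.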
Keywords/Search Tags: visual odometry, pose estimation, deep learning, convolutional neural networks, recurrent convolutional neural networks