Research And Application Of Image Depth And Pose Estimation For Monocular Camera

Posted on: 2020-04-23
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Jiang
GTID: 2428330605469364
Subject: Software engineering
Abstract/Summary:
Inferring the three-dimensional (stereo) structure of a scene is a common task in computer vision. Its basic purpose is to recover that structure from the planar, two-dimensional information acquired by sensors. Many studies address the problem with binocular cameras or distance sensors. Methods based on a monocular camera can use only local two-dimensional information, so they face more challenges in inferring scene structure. However, because they rely less on sensors and can solve the problem with limited information, they are of great research value. The two major problems to solve are transforming two-dimensional information into local stereo information and inferring the relationship between these pieces of local stereo information. The former is called depth estimation, and the latter is called pose estimation.

This thesis proposes an improved algorithm for the PnP (Perspective-n-Point) problem in feature-point-based depth and pose estimation, and combines it with SLAM (Simultaneous Localization and Mapping) in experiments that verify its effectiveness on common data sets and in real scenes. To reduce the cumulative pose error, a key frame selection strategy is designed to ensure the quality of key frames and limit their number. To address outliers that degrade accuracy, a matching-point selection strategy is designed to eliminate them. Experiments on data sets and in real scenes show that the proposed algorithm effectively reduces errors.

For monocular depth and pose estimation based on unsupervised deep learning, a stereo optimization algorithm based on three-dimensional point clouds is proposed to refine the estimated depth and pose. While the pose is estimated and the depth map is reconstructed, the generated depth map is required to keep local smoothness and structural similarity with the original image, and the estimated pose is also used to make the spatial points (point clouds) coincide with each other as much as possible. The algorithm aligns these spatial points with an iterative point cloud registration algorithm and drops outliers over many iterations to obtain more accurate results. The experimental results show that the proposed algorithm is more accurate than state-of-the-art methods.

To improve the accuracy of depth estimation, this thesis designs a convolutional neural network with multi-task joint training that exploits the intrinsic relationship between depth maps and semantic segmentation maps. The network uses an encoder-decoder architecture. The rich features that the encoder extracts from the image are fed into a depth estimation decoder and a semantic segmentation decoder, which reconstruct the depth map and the semantic segmentation map respectively. To connect the two decoders, an information sharing strategy is designed to maintain the inherent consistency between depth maps and semantic segmentation maps. The experimental results show that the depth values generated by the network are more accurate than those of state-of-the-art methods.
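
For readers unfamiliar with the PnP setting mentioned in the abstract, the following minimal sketch shows the quantity a PnP solver typically minimizes: the reprojection error between observed pixels and 3D points projected through a candidate camera pose. It also shows a simple threshold filter for discarding outlier matches. This is an illustration only. The abstract does not describe the thesis's actual PnP algorithm or its matching-point selection strategy, and the function names, the pinhole model without distortion, and the 2-pixel threshold are all assumptions made here.

```python
import math

def project(point_w, R, t, K):
    """Project a 3D world point to pixel coordinates with a pinhole camera.

    R is a 3x3 rotation (list of rows), t a 3-vector, and K = (fx, fy, cx, cy).
    Lens distortion is ignored (assumption for this sketch).
    """
    # Camera-frame coordinates: X_c = R * X_w + t
    xc = [sum(R[i][j] * point_w[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 0:
        raise ValueError("point is not in front of the camera")
    fx, fy, cx, cy = K
    return (fx * xc[0] / xc[2] + cx, fy * xc[1] / xc[2] + cy)

def reprojection_errors(points_w, pixels, R, t, K):
    """Euclidean pixel distance between each observation and its projection."""
    errors = []
    for point_w, (u, v) in zip(points_w, pixels):
        pu, pv = project(point_w, R, t, K)
        errors.append(math.hypot(pu - u, pv - v))
    return errors

def select_inliers(points_w, pixels, R, t, K, max_error_px=2.0):
    """Indices of matches whose reprojection error is within the threshold.

    A generic threshold test (hypothetical), not the thesis's strategy.
    """
    errors = reprojection_errors(points_w, pixels, R, t, K)
    return [i for i, e in enumerate(errors) if e <= max_error_px]
```

In a full pipeline, a pose would be estimated from the matches, inliers would be re-selected with a filter like this, and the pose would be re-estimated on the inliers only. The thesis may organize these steps differently.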
Keywords/Search Tags: depth estimation, pose estimation, monocular camera, deep learning, semantic segmentation