Depth Estimation From A Monocular Image

Posted on: 2016-05-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Tian
Full Text: PDF
GTID: 1108330482957833
Subject: Communication and Information System
Abstract/Summary:
Recovering 3D depth information from images is an important problem in computer vision. If the 3D structure of a scene can be estimated accurately, the scene can be better understood through the 3D relationships among the objects in the image. This in turn benefits many computer vision applications, such as robotics, video surveillance and 2D-to-3D conversion. For these reasons, this thesis focuses on depth estimation from monocular input, including depth estimation from videos with camera motion and depth estimation based on machine learning. The proposed techniques address several difficulties in depth estimation, namely the limited applicability of existing algorithms, the similarity measurement of 3D structure, feature selection and depth smoothness, and thereby improve the accuracy of depth estimation. The main contributions of this thesis are as follows:

1) We propose an algorithm for recovering depth from a monocular video. Because most existing methods are limited to particular scenarios, we combine existing methods into a depth estimation framework that handles monocular videos with or without camera motion. The framework recovers the depth of both the background and the foreground. For the case of a moving camera, we propose a novel global motion estimation method with effective outlier rejection, which improves the accuracy of moving object extraction. This ensures that the extracted moving objects have more complete contours in the estimated depth maps.

2) We propose a depth estimation method based on metric learning. To measure the similarity of 3D structure between images, we propose a depth sampling method for single-image depth estimation that uses a learned Mahalanobis distance instead of the traditional Euclidean distance, which effectively improves the depth sampling results. We construct a loss function over the parameter matrix of this Mahalanobis distance and optimize it on generated training data that reflect the similarity of 3D structure between images. In addition, to address the slow speed of depth fusion, we propose a fast depth fusion method based on a Gaussian weighting function.

3) We propose two deep learning algorithms for depth estimation from a single image. First, we use a convolutional neural network to model the relationship between the raw image and depth. The network automatically learns representative features from a large image window, which effectively addresses the problem of feature ambiguity. We also apply the trained network convolutionally to the entire raw image to produce depths for many pixels simultaneously, which greatly reduces computation time at test time. Second, to handle the depth smoothing problem, we propose a deep network that combines a convolutional neural network with a conditional random field. This network models not only the relationship between the raw image and depth but also the relationship between depths at different positions. We train this network by optimizing the loss function of the conditional random field. The network generates depth maps with clearer object contours and achieves both accuracy and smoothness in the estimated depth. Compared with existing depth estimation algorithms, the two proposed algorithms require no hand-engineered features and make no assumptions about the semantic information of the scene, and therefore have broader applicability.
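As a rough illustration of the ideas in contribution 2, the sketch below retrieves the nearest candidate images under a Mahalanobis distance and fuses their depth maps with Gaussian weights. It is not the thesis's implementation: the function names, the feature vectors, the matrix M (assumed here to be already learned and positive semi-definite), and the parameters sigma and k are all hypothetical choices made for the example.

```python
import math


def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y).

    M is assumed to be a positive semi-definite matrix (list of rows);
    in the thesis it would be learned, here it is simply given.
    """
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def gaussian_fuse(query, candidates, M, sigma=1.0, k=2):
    """Fuse the depth maps of the k nearest candidates (hypothetical helper).

    candidates: list of (feature_vector, depth_map) pairs, where each
    depth_map is a flat list of per-pixel depths of equal length.
    Each selected candidate gets weight exp(-d^2 / (2 * sigma^2)).
    """
    # Rank candidates by squared Mahalanobis distance to the query.
    scored = sorted(
        ((mahalanobis_sq(query, feat, M), depth) for feat, depth in candidates),
        key=lambda pair: pair[0],
    )[:k]
    weights = [math.exp(-d2 / (2.0 * sigma ** 2)) for d2, _ in scored]
    total = sum(weights)  # assumed non-zero; very large distances could underflow
    n_pixels = len(scored[0][1])
    # Per-pixel weighted average of the selected depth maps.
    return [
        sum(w * depth[p] for w, (_, depth) in zip(weights, scored)) / total
        for p in range(n_pixels)
    ]
```

With M set to the identity matrix the distance reduces to the squared Euclidean distance, so the learned matrix is what lets the retrieval step reflect similarity of 3D structure rather than raw feature similarity.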
Keywords/Search Tags: depth estimation, global motion estimation, metric learning, deep learning, convolutional neural network, conditional random field