Research On Depth Estimation Algorithms For Monocular Image

Posted on: 2019-04-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H H Xu
Full Text: PDF
GTID: 1368330572955028
Subject: Communication and Information System
Abstract/Summary:
Recovering 3D depth information from 2D images is a key research problem in computer vision. Accurate depth information helps in understanding the three-dimensional structure of a scene and the spatial relationships between the objects in an image, and therefore supports existing visual tasks. Depth information is widely used in 3D reconstruction, robot navigation, 3D stereoscopic display, and virtual reality. However, the depth information of a scene is lost when a 2D image is captured by an ordinary camera. In addition, depth acquisition equipment has a short sensing range and captures depth at low resolution, which strongly limits its further application. Recovering depth information from 2D images or videos has therefore become an important task in computer vision.

This dissertation focuses on 3D depth recovery for 2D scenes and studies several important issues: adaptive depth estimation from monocular video; recovery of depth from a single image using a data-driven method and multiple depth cues; single-image depth reconstruction based on non-parametric learning in the gradient domain; and depth estimation from monocular video based on gradient samples and bi-directional depth propagation. The major contributions are:

1. To address the problem that most existing methods are limited to a particular scenario, an adaptive depth estimation framework for monocular videos is proposed. First, the motion type is divided into three classes: no motion (neither object motion nor camera motion), local motion (object motion only), and global motion (camera motion). The scene type under global motion is further classified into two classes: stationary object and moving object. Different strategies are then designed on the basis of the motion type and the scene type. In addition, the depth generation method for the no-motion scenario, in which neither the object nor the camera moves, is also suitable for a single image.

2. A depth estimation algorithm based on a data-driven method and depth cues is proposed. To obtain a perceptually reasonable depth map whose structure conforms to the actual scene, an input image is first classified as a non-object image or an object image via learning-based image classification. For a non-object image, the initial global depth map is generated with the data-driven method, and an image segmentation algorithm is applied to generate more local depth information. For an object image, the initial depth map is recovered by leveraging linear perspective and saliency information. Defocus-based depth information is used to show more depth detail and enhance depth perception.

3. A new non-parametric, learning-based depth recovery framework that makes full use of large-scale RGBD (RGB and depth image) datasets in the gradient domain is proposed. A local matching-based depth gradient transfer strategy helps generate meaningful depth gradients from similar training images. More importantly, a confidence measure-based depth gradient fusion scheme is introduced, which measures the individual contribution of each pixel in the warped depth gradient maps. In addition, an edge-aware depth gradient refinement process is designed to mitigate depth gradient outliers and generate accurate depth gradient values.

4. An automatic algorithm for extracting depth from 2D videos, based on non-parametric learning and bi-directional depth propagation, is proposed. For each 2D input video, depth maps are predicted separately for key frames and non-key frames. The global depth maps of the key frames are estimated from gradient samples and refined using local information about foreground objects. The depth maps of the key frames are then propagated forward and backward across all non-key frames based on bi-directional motion estimation. Finally, the propagated maps are fused with a weighting strategy to generate the depth map of each non-key frame.
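
The fusion step in contribution 4 can be illustrated with a minimal sketch. The abstract does not state the weighting strategy, so the code below assumes a simple one: each propagated depth map is weighted by how close its source key frame is in time. The function name `fuse_bidirectional` and all of its parameters are hypothetical, and both input maps are assumed to be already warped to the non-key frame by motion estimation.

```python
def fuse_bidirectional(depth_fwd, depth_bwd, t, t_prev_key, t_next_key):
    """Blend two propagated depth maps for the non-key frame at index t.

    depth_fwd: map propagated forward from the previous key frame.
    depth_bwd: map propagated backward from the next key frame.
    Both are lists of rows with identical shape.
    """
    if not (t_prev_key < t < t_next_key):
        raise ValueError("t must lie strictly between the two key frames")

    span = t_next_key - t_prev_key
    # Assumed weighting (not from the source): the temporally closer
    # key frame contributes more, and the two weights sum to 1.
    w_fwd = (t_next_key - t) / span
    w_bwd = (t - t_prev_key) / span

    # Per-pixel weighted average of the two propagated maps.
    return [
        [w_fwd * f + w_bwd * b for f, b in zip(row_f, row_b)]
        for row_f, row_b in zip(depth_fwd, depth_bwd)
    ]


# Frame 1 lies between key frames 0 and 4, so the forward map gets weight 0.75.
fused = fuse_bidirectional([[0.0, 4.0]], [[4.0, 8.0]], t=1, t_prev_key=0, t_next_key=4)
```

A weighting based on temporal distance is only one option; a scheme that also accounts for the reliability of the motion estimate at each pixel would fit the same interface.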
Keywords/Search Tags:Depth estimation, motion classification, data-driven, depth cue, non-parametric learning, depth propagation