Font Size: a A A

Depth Estimation From Single Monocular Images Using Deep Learning

Posted on:2017-02-14Degree:MasterType:Thesis
Country:ChinaCandidate:Grigorev AlekseiFull Text:PDF
GTID:2308330509457617Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
One of the main goal of computer vision is the image understanding. Despite the recent success in different tasks, such as object recognition, pose estimation, etc., some of them still remain formidable, however. Much progress has been made in extracting primitives from image and understanding it. Nevertheless, modern robotic systems still have not reached the human level of scene understanding, because algorithms have a high complexity and require a huge amount of modern hardware.Depth estimation is a significant task in the robotics vision. A wide range of vision problems has proven to benefit from the incorporation of the depth information. In this thesis we address the depth estimation from a single monocular image, which is an illposed problem because there are an infinite number of world scenes may have produced the given image. When a huge amount of research was focused on depth estimation from a stereo images or motion, depth prediction from a single image has attracted attention in recent years.To address our main objective, we develop several deep convolutional neural networks that can be applied to predict depth from the image. The first network predicts the depth of a sequence of superpixels. The main idea is process image into the sequential manner, which can decrease network training time. The second hybrid network can be applied to predict the depth of the whole image. The main difference of the current networks for depth prediction is that our network is composed of convolution and recurrent layers, which are composed of the Long Short-Term Memory units(LSTM). The LSTM unit is famous for the ability to memorize long-range context, which can be applied to feature maps and lead to better feature predictions.Our experiments on benchmark dataset demonstrated, that hybrid network can be efficiently applied to predict a depth of an image. Moreover, the composition of recurrent and convolutional layers provide more satisfied results. In addition, we investigate the difference between a standard and hybrid network.
Keywords/Search Tags:CNN, LSTM, depth estimation, monocular image, RNN
PDF Full Text Request
Related items