Research On Techniques For Depth Estimation From Monocular Image And Video

Posted on: 2020-06-10
Degree: Master
Type: Thesis
Country: China
Candidate: R B Li
GTID: 2428330590458243
Subject: Control Science and Engineering
Abstract/Summary:
In recent years, with the rapid development of artificial intelligence, applications such as autonomous driving, 3D display, and robotics have gradually entered public view and begun to affect people's lives. In these applications, obtaining accurate depth information from a scene has become a focus of research. The main approach to depth estimation from a monocular image or video is to employ deep convolutional neural networks (DCNNs) to learn a direct mapping from the image domain to the depth domain from RGB-Depth datasets. However, current mainstream algorithms in this field face three problems: (1) In monocular video depth estimation, directly applying image-based DCNNs to the video task produces severe spatial-temporal inconsistency in the predicted depth map sequences, which impairs the quality of 3D video. (2) In monocular image depth estimation, current models cannot adapt to both indoor and outdoor scenes with a single parameter set, which limits their practicality and robustness. (3) Current models for monocular depth estimation consume large amounts of memory and computation, which restricts their use on mobile devices.

In this thesis, we propose a solution to each of these three problems. For monocular video depth estimation, we propose a recurrent conditional deep field model. This method combines a spatial-temporal conditional random field with a generic convolutional neural network in a unified CNN model, so that spatial-temporal dependencies across the depth map sequence can be established, which improves both the accuracy and the spatial-temporal consistency of the results.

For monocular image depth estimation in diverse scenes, we propose a novel deep attention-based classification network. In this model, an attention module captures the visual characteristics of the scene, and a depth classification module formulates depth prediction as a multi-class classification problem, which makes the model easier to optimize. Experimental results on both indoor and outdoor datasets demonstrate the effectiveness of the method.

For monocular depth estimation on mobile devices, we propose a light-weight neural network. In this model, we replace the regular convolution operation with a depthwise separable convolution and train the model with a novel biweight loss function. Experimental results demonstrate that the model achieves good performance with fewer model parameters.
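Two of the ideas above, treating depth prediction as multi-class classification and replacing regular convolutions with depthwise separable ones, can be illustrated with a small sketch. The abstract does not specify how depth is discretized or the layer sizes used, so the log-spaced bins, the function names, and the example channel counts below are all assumptions made for illustration, not the thesis's actual design.

```python
import math


def depth_to_class(depth, d_min, d_max, num_classes):
    """Map a metric depth to a class index.

    Assumption: bins are uniformly spaced in log-depth between d_min and
    d_max. The abstract does not say which discretization is used.
    """
    depth = min(max(depth, d_min), d_max)  # clamp into the valid range
    frac = math.log(depth / d_min) / math.log(d_max / d_min)
    return min(int(frac * num_classes), num_classes - 1)


def class_to_depth(index, d_min, d_max, num_classes):
    """Decode a class index to the log-space center of its bin."""
    frac = (index + 0.5) / num_classes
    return d_min * (d_max / d_min) ** frac


def regular_conv_params(c_in, c_out, k):
    """Weights in a regular k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution followed by a 1 x 1
    pointwise convolution (bias terms ignored)."""
    return k * k * c_in + c_in * c_out


if __name__ == "__main__":
    # Hypothetical settings: 64 depth classes over a 0.5 m to 80 m range.
    idx = depth_to_class(10.0, 0.5, 80.0, 64)
    print("class index:", idx)
    print("decoded depth (m):", class_to_depth(idx, 0.5, 80.0, 64))

    # Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
    print("regular conv weights:", regular_conv_params(64, 128, 3))
    print("separable conv weights:", depthwise_separable_params(64, 128, 3))
```

With classes in place of continuous values, the network can be trained with a standard classification loss, and the predicted class is decoded back to a depth. The parameter counts show why the separable form suits mobile devices: for the hypothetical layer above, the weight count falls from 73728 to 8768.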
Keywords/Search Tags: Monocular depth estimation, Deep learning, Probabilistic graphical model, Attention network