
Monocular Image Depth Estimation Based On Deep Learning

Posted on: 2021-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: C X Feng
Full Text: PDF
GTID: 2428330605467468
Subject: Mechanical engineering
Abstract/Summary:
Image depth estimation is a key technology that lets computers understand and reconstruct the 3D spatial information in images. It is also a research hotspot in applying artificial intelligence and intelligent picking equipment to real scenes. However, depth estimation from a monocular image is an ill-posed and highly complex problem, so traditional estimation algorithms are difficult to implement and have developed slowly. Building on the continued breakthroughs of deep learning in image processing, this thesis designs deep neural network models that perceive the characteristics of objects in an image end-to-end and thereby predict the spatial distance represented by each pixel. It also designs improved loss functions and perception modules. On image depth estimation datasets covering a variety of scenes, the resulting models reach the current advanced level in both accuracy and visual quality.

First, for indoor scenes, this thesis designs an encoder-decoder depth estimation model based on transfer learning. To address the information loss caused by feature fusion during upsampling, three transfer feature enhancement modules are proposed to further improve the accuracy and efficiency of the model. Experiments comparing the performance of different pretrained models on the depth estimation task verify the superiority of the Xception model as the encoder of the overall framework. Freezing the parameters of the pretrained model lets the encoder retain the effective prior knowledge learned from large-scale image tasks. This reduces the number of trainable parameters and makes the model considerably easier to apply in practical scenarios.

Second, for outdoor scenes, this thesis combines the widely used feature pyramid module with a channel attention mechanism to design an efficient, end-to-end trainable architecture. The overall architecture is upgraded in three main respects: knowledge enhancement, feature extraction, and multi-scale feature processing. In addition, a depth map smoothing loss and a scale consistency loss are proposed to improve the high-frequency details of the predictions and to speed up parameter convergence during training.

Extensive comparative experiments show that the proposed methods achieve good estimation accuracy on the indoor dataset NYU V2 and the outdoor dataset KITTI. Measured by mean square error, threshold accuracy, and other indicators, they are more accurate and more robust than current well-performing monocular image depth estimation techniques, which provides a favorable technical basis for subsequent intelligent picking equipment.
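
The abstract names mean square error and threshold accuracy as evaluation indicators but does not define them. The sketch below shows how these two indicators are commonly computed in the monocular depth estimation literature. It is an illustration under stated assumptions, not the thesis's own evaluation code: the function name `depth_metrics`, the thresholds 1.25, 1.25^2 and 1.25^3, and the convention that ground-truth values of 0 mark missing pixels are all assumptions, and predictions are assumed to be strictly positive.

```python
import numpy as np


def depth_metrics(pred, gt, thresholds=(1.25, 1.25 ** 2, 1.25 ** 3)):
    """Compute MSE, RMSE and threshold accuracy for a predicted depth map.

    Assumptions (not taken from the thesis): gt == 0 marks pixels without
    a valid ground-truth depth, and pred is strictly positive.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)

    # Keep only pixels that have a valid ground-truth depth.
    valid = gt > 0
    pred, gt = pred[valid], gt[valid]

    # Mean square error and its root over the valid pixels.
    mse = float(np.mean((pred - gt) ** 2))
    rmse = float(np.sqrt(mse))

    # Threshold accuracy: fraction of pixels whose ratio
    # max(pred / gt, gt / pred) is below each threshold.
    ratio = np.maximum(pred / gt, gt / pred)
    accuracy = [float(np.mean(ratio < t)) for t in thresholds]

    return {"mse": mse, "rmse": rmse, "threshold_accuracy": accuracy}


if __name__ == "__main__":
    gt = np.array([[1.0, 2.0], [4.0, 0.0]])    # 0.0 = no ground truth
    pred = np.array([[1.0, 2.2], [8.0, 3.0]])
    print(depth_metrics(pred, gt))
```

Lower MSE is better, while threshold accuracy lies between 0 and 1 and higher is better. Because threshold accuracy is based on a ratio, it penalizes relative rather than absolute error, so an error of 1 m counts more heavily at close range than at long range.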
Keywords/Search Tags: Image depth estimation, supervised learning, scale invariance, transfer learning