| Depth estimation is an important research topic in computer vision and is widely used in fields such as autonomous driving, virtual reality, and 3D reconstruction. Depth estimation methods can be divided into active and passive methods. Active methods include lidar, time-of-flight, and structured light. However, because these methods are costly and tend to expose the position of the device, their application scenarios are limited to some extent. Passive depth estimation has become a research hotspot in recent years because of its relatively simple structure and its applicability to many scenarios. Deep learning extracts image features through convolutional layers, and model performance can keep improving as the amount of data grows, which gives it significant advantages in image processing. However, existing monocular depth estimation methods that rely only on deep learning algorithms suffer from problems such as blurred boundaries and low resolution. Based on optical aperture coding, this paper proposes an end-to-end depth estimation method that jointly optimizes the optical mask plate and the convolutional neural network. The main research contents of this paper are as follows. The theory of optical aperture coding and of deep learning for depth estimation is introduced. The relationships among the optical transfer function, point spread function, and modulation transfer function are analyzed, together with the modulation effect of the lens on incident light in aperture coding. The feature extraction networks and common datasets used in deep learning are introduced. The design and optimization process of the aperture coding mask plate in the optical layer is studied. Mask plates with different surface shapes, such as cubic, exponential, and sinusoidal mask plates, were fitted, and the fitted parameters were then input into optical design software for modeling and simulation. By analyzing the 
change in the point spread function of each mask plate with object distance, it was concluded that the change in distance corresponds to the position of the peak of the point spread function. The distance variation and optical image quality were evaluated to obtain the optimized surface shape of the phase mask. A monocular depth estimation model based on deep learning is established. Image features are extracted at different scales, and an attention module then improves the expressive ability of the model. Training is carried out mainly on an indoor dataset and an outdoor dataset, and the depth information of both datasets is completed. In the quantitative comparison, the accuracy of the proposed model improves by 0.37% over previous models, and the resolution of the predicted depth maps and the boundary regions of scene objects are significantly improved. An aperture-coded monocular depth estimation model fused with convolutional layers is established. Aperture coding is combined with a convolutional neural network to build a depth estimation model that optimizes the mask plate parameters and the network parameters jointly, end to end. The mask plate encodes the input image into a blurred intermediate image, which is then fed into the convolutional neural network for training. The mask plate parameters and the network parameters are optimized simultaneously according to the loss between input and output. The visualization results allow the optimization of the mask surface and the change in accuracy of the predicted depth image to be observed directly. During training, experiments on indoor and outdoor datasets show that the accuracy metric of the proposed model is 1.5% higher than that of 
the model without a mask plate, and the object boundary contours in the predicted depth maps are relatively smooth, which further demonstrates the effectiveness of the proposed method. |
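
The relationships among the pupil function, point spread function (PSF), optical transfer function (OTF), and modulation transfer function (MTF) referred to in the abstract can be illustrated with a minimal NumPy sketch. It is not the simulation used in this work, which relied on optical design software. It assumes a circular aperture carrying a cubic phase mask, represents object distance by a quadratic defocus term, and uses the standard Fourier-optics relations: the incoherent PSF is the squared magnitude of the Fourier transform of the pupil function, the OTF is the Fourier transform of the PSF, and the MTF is the magnitude of the OTF. The function names, grid size, mask strength `alpha`, and defocus values `psi` are hypothetical choices made for illustration only.

```python
import numpy as np


def coded_pupil(n=128, alpha=20.0, psi=0.0):
    """Complex pupil function of a circular aperture with a cubic phase mask.

    alpha: cubic mask strength in radians (hypothetical value).
    psi:   defocus phase at the aperture edge in radians; it stands in for
           object distance, because defocus changes with distance.
    """
    # Aperture radius is 1; the grid spans [-2, 2) so the pupil is zero-padded.
    x = np.linspace(-2.0, 2.0, n, endpoint=False)
    xx, yy = np.meshgrid(x, x)
    aperture = (xx**2 + yy**2) <= 1.0
    phase = alpha * (xx**3 + yy**3) + psi * (xx**2 + yy**2)
    return aperture * np.exp(1j * phase)


def psf_from_pupil(pupil):
    """Incoherent PSF: squared magnitude of the Fourier transform of the pupil."""
    field = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(pupil)))
    psf = np.abs(field) ** 2
    return psf / psf.sum()  # normalize to unit energy


def otf_and_mtf(psf):
    """OTF is the Fourier transform of the PSF; MTF is its magnitude."""
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    otf = otf / otf[0, 0]  # normalize so the zero-frequency response is 1
    return otf, np.abs(otf)


if __name__ == "__main__":
    # Track how the PSF peak moves as the assumed defocus (distance proxy) changes.
    for psi in (-10.0, -5.0, 0.0, 5.0, 10.0):
        psf = psf_from_pupil(coded_pupil(psi=psi))
        peak = np.unravel_index(np.argmax(psf), psf.shape)
        _, mtf = otf_and_mtf(psf)
        print(f"psi={psi:+.1f} rad  peak index={peak}  mean MTF={mtf.mean():.4f}")
```

Running the script prints the pixel index of the PSF peak for each assumed defocus value, which is one simple way to inspect how the peak position depends on the distance proxy for a given mask shape. Replacing the cubic term in `phase` with another surface function would allow the same comparison for other mask shapes.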