| With advances in science and technology and rising living standards, 3D technologies, represented by virtual reality and augmented reality, have penetrated every aspect of daily life. From large parties to international events, from 3D games to film and television production, and from driving navigation to autonomous driving, demand for 3D technologies keeps growing, and the requirements placed on them are increasingly stringent. This in turn sets high expectations for reconstruction technology, the cornerstone of 3D technology. Among the many reconstruction techniques, depth estimation is particularly representative because of its broad application prospects and relatively loose application conditions. As a fundamental research topic in computer vision, depth estimation aims to estimate the depth of each pixel in a given image, thereby transforming a color image into a depth map. Early depth estimation mainly used traditional machine learning methods to recover the 3D information of a scene from a pre-estimated (or given) depth map prior. This prior was usually estimated by linear regression, which limited the quality and clarity of the final depth map. As a result, early depth estimation placed very strict requirements on the application scenario. In recent years, deep learning, with its flexibility and strong generalization ability, has excelled in all fields of computer vision. With the development of deep learning, depth estimation has achieved much better performance and therefore shows broader application prospects. However, depth estimation is in essence the inverse of the projection from a 3D scene onto a 2D image, so the back-projection is usually ill-posed. In addition, due to the discretization of the real 
world scene by the camera during sampling and the perspective effect of the camera lens (near objects appear large and far objects appear small), semantic discontinuities easily arise during depth estimation, which further aggravates the impact of the back-projection problem. The research in this thesis is divided into the following three aspects. First, traditional depth estimation algorithms obtain prior depth information by learning the mapping relationship in the data set, and pay no attention to the ill-posedness caused by back-projecting 2D images into 3D space. As a result, the prior information in the initially estimated (or preset) depth map only serves as the initial state of the iterative algorithm and provides no effective help to the subsequent machine learning algorithm. Moreover, because the machine learning algorithms in traditional depth estimation must themselves solve complex graph optimization problems, researchers tend to keep the graph model as simple as possible, so the algorithm can extract only limited local features from the input image. In this thesis, a dictionary learning algorithm based on image patch retrieval is proposed to compute the prior information of depth maps. The prior information is introduced into the depth map to be solved through a Taylor expansion. Compared with traditional methods, the dictionary learning algorithm based on image patch retrieval not only estimates the depth map prior well, but also makes the estimated prior more informative. At the same time, by introducing second-order information into the machine learning method, the proposed algorithm takes higher-dimensional correlations between image pixels into account, so that the model captures local characteristics better. Second, in addition to the back-projection problem, many researchers have noticed the semantic discontinuity problem and proposed to 
introduce auxiliary information to weaken the ill-posedness and thereby improve the accuracy of depth estimation. Semantic discontinuity is addressed by completing the scene information, including object detection, semantic segmentation and object contours, so as to constrain both the variation of depth values between adjacent pixels and the solution space of each pixel. On this basis, this thesis proposes a neural network algorithm that estimates object depth by completing viewpoint information under the guidance of semantic information. The algorithm not only alleviates the ill-posedness of back-projection by completing the viewpoints, but also effectively improves accuracy by introducing semantic information into the neural network to better model the relationship between adjacent pixels. Third, because of the strong generalization and learning ability of deep learning, existing depth estimation algorithms focus mainly on network architecture design and pay insufficient attention to how the depth estimation model itself is optimized. In this thesis, an optimization framework based on a hyper anchor graph is proposed to replace the original deep learning loss function for optimizing the neural network. Compared with a plain neural network structure, the graph neural network proposed in this thesis incorporates the controllability of the optimization models used in traditional depth estimation algorithms. By adding effective constraints, the training of the neural network converges effectively. At the same time, to further weaken the ill-posedness of depth estimation and to address semantic discontinuity, this thesis proposes integrating semantic information into the graph model, further enhancing the neural network's reasoning ability for depth estimation. |
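
The ill-posedness of back-projection discussed in the abstract can be illustrated with a minimal sketch. It assumes an idealized pinhole camera, which the abstract itself does not specify; the names `project` and `scale_along_ray` and the intrinsics `focal`, `cx`, `cy` are hypothetical and chosen only for illustration. Two 3D points on the same viewing ray map to the same pixel, so a single image cannot fix depth without additional priors or constraints.

```python
def project(point, focal=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of a camera-frame point (X, Y, Z) to a pixel (u, v).

    The intrinsics are illustrative assumptions, not values from the thesis.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    return (focal * x / z + cx, focal * y / z + cy)


def scale_along_ray(point, s):
    """Move a point along its viewing ray by scaling all coordinates by s."""
    return tuple(s * c for c in point)


# A point 1 unit away and a point 4 units away on the same viewing ray.
near = (0.25, -0.125, 1.0)
far = scale_along_ray(near, 4.0)

pixel_near = project(near)
pixel_far = project(far)

# Both depths explain the same pixel, so depth is not recoverable from it alone.
print(pixel_near, pixel_far, pixel_near == pixel_far)
```

Every depth along the ray is consistent with the observed pixel, which is why priors, semantic cues, or additional viewpoints are needed to narrow the solution space.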