
Research On 3D Reconstruction In Vision Based On Deep Learning

Posted on: 2020-10-02
Degree: Master
Type: Thesis
Country: China
Candidate: L Q Liu
GTID: 2428330575974267
Subject: Information and Communication Engineering
Abstract/Summary:
Vision-based 3D reconstruction is widely used in virtual reality, autonomous driving, and other fields. Among these techniques, monocular vision is low in cost and easy to deploy. However, reconstructing 3D information from 2D images is an inherently ill-posed problem, because a single view lacks the information needed to recover scene geometry, which makes depth estimation difficult. This thesis estimates depth with deep learning and then reconstructs a point cloud from the depth map using the camera parameters, thereby achieving vision-based 3D reconstruction. Because depth prediction is the core task in monocular 3D reconstruction, the work concentrates on estimating high-quality depth from a single image with a convolutional neural network (CNN) and a conditional generative adversarial network (cGAN). The main contributions are as follows:

(1) A CNN-based Multi-Layer Ensembled Encoder-Decoder Network (MLEED-Net) is proposed. It is an end-to-end network that estimates depth directly from the input color image. At the encoder, a Multi-Layer Ensembled Block fuses multi-scale features to make better use of the encoder's feature information. At the decoder, a Residual Up-Projection Block decodes high-level semantic information, using a multi-receptive-field convolution structure to extract multi-scale features.

(2) A cGAN originally applied to image domain transformation is improved for depth estimation, with changes to both the generator and the discriminator. In the generator, a Skip-Convolution Down-Sampling Module replaces the encoder block of the original cGAN, which improves depth estimation accuracy and reduces the number of network parameters. In the discriminator, a Pyramid Matching Network adds a feature pyramid to the original discriminator so that multi-scale feature information improves its discriminative ability.

(3) Depth estimation and point cloud reconstruction are evaluated. Several public datasets are used to train and test the networks. For depth estimation on the indoor dataset NYUD v2, the proposed CNN method improves accuracy at δ < 1.25 by 14.2% over the "CNN+CRF" method, and the improved cGAN exceeds the original cGAN by 6.2% at δ < 1.25. Higher depth estimation accuracy is also achieved on the outdoor dataset KITTI. For point cloud reconstruction, 3D point clouds are reconstructed from depth maps on the indoor dataset, which demonstrates the feasibility of the approach.

In summary, the proposed methods achieve better depth estimation and 3D point cloud reconstruction performance on public datasets, which demonstrates their effectiveness for vision-based 3D reconstruction.
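The abstract states that the point cloud is reconstructed from the depth map according to the camera parameters but does not give the procedure. One common way to do this is pinhole back-projection, sketched below under that assumption. The function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` (focal lengths and principal point, in pixels) are illustrative names and do not come from the thesis.

```python
import numpy as np


def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (N, 3) point cloud.

    Assumes a pinhole camera without lens distortion. Depth values are
    distances along the optical axis; pixels with depth <= 0 are treated
    as invalid and dropped.
    """
    depth = np.asarray(depth, dtype=np.float64)
    h, w = depth.shape
    # u is the column index (x direction), v is the row index (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]


if __name__ == "__main__":
    # Hypothetical 2x3 depth map (metres) and made-up intrinsics.
    demo_depth = np.array([[0.0, 2.0, 2.0],
                           [2.0, 2.0, 2.0]])
    cloud = depth_to_point_cloud(demo_depth, fx=1.0, fy=1.0, cx=1.0, cy=0.5)
    print(cloud.shape)
```

In practice the intrinsics would come from the dataset's calibration, and the resulting array can be passed to any point cloud viewer or library.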
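The accuracy figures above are reported at a threshold on δ. In the depth estimation literature this usually denotes the fraction of pixels whose ratio max(predicted/true, true/predicted) falls below the threshold; the sketch below assumes that definition, which the abstract itself does not spell out. The function name `threshold_accuracy` and the masking of non-positive depths are assumptions made for illustration.

```python
import numpy as np


def threshold_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) < threshold.

    Pixels where either depth is non-positive are excluded, since the
    ratio is undefined there. Returns NaN if no valid pixel remains.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = (gt > 0) & (pred > 0)
    if not valid.any():
        return float("nan")
    ratio = np.maximum(pred[valid] / gt[valid], gt[valid] / pred[valid])
    return float(np.mean(ratio < threshold))


if __name__ == "__main__":
    # Hypothetical predictions and ground truth in metres; the last
    # ground-truth value is 0 and is therefore ignored.
    predicted = [1.0, 2.0, 3.0, 4.0]
    ground_truth = [1.0, 1.0, 3.5, 0.0]
    print(threshold_accuracy(predicted, ground_truth))
```

A higher value is better, and the looser thresholds 1.25² and 1.25³ are often reported alongside 1.25.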
Keywords/Search Tags: deep learning, 3D reconstruction, depth estimation, convolutional neural network, generative adversarial network