
Research On 3D Object Reconstruction Based On Deep Convolutional Features

Posted on: 2021-05-08
Degree: Master
Type: Thesis
Country: China
Candidate: R Sun
Full Text: PDF
GTID: 2428330647967241
Subject: Mechanical and electrical engineering
Abstract/Summary:
Using image information for accurate three-dimensional (3D) reconstruction has long been one of the important tasks in machine learning and computer vision. The real world is 3D, while captured images and image sequences are two-dimensional. Information such as the color, shape, texture, and depth of the objects contained in an image is essential for recovering accurate 3D geometry. Although humans can infer the 3D structure of a scene and the shapes of objects from limited information thanks to strong prior knowledge, it is extremely challenging for computers to reconstruct a 3D object or scene from one or multiple viewpoints. In essence, an image does not have a one-to-one correspondence with its 3D structure, so a single image inherently lacks the information needed to construct 3D geometry. With the rapid development of convolutional neural networks (CNNs) in the field of 3D vision, reconstructing an object from a single image has become possible. However, several problems in the single-view reconstruction task remain to be solved, such as the lack of supervisory signals, noise points on the surface of the reconstructed point cloud model, and missing surface detail when the input image has an ambiguous viewpoint. This paper therefore starts from these problems in single-view object reconstruction and investigates them further. The main innovations and contributions are summarized as follows:

Firstly, to address the lack of supervisory signals for single-view object reconstruction, a point cloud generation network based on self-supervised learning is proposed. The network first generates a rough initial point cloud from a single input RGB image, and then renders the initial point cloud back into a binary image for comparison with the input image. The mean squared error between the two images serves as the loss function and is minimized during training to obtain a relatively accurate 3D point cloud generation model. A self-constrained, end-to-end training scheme is thus formed between the reconstructed binary image and the input image, which provides a new self-supervisory signal for single-view reconstruction. The method performs well on the ShapeNet dataset.

Secondly, to reduce the noise points on the surface of the reconstructed point clouds, a 3D reconstruction optimization method based on a variational auto-encoder (VAE) is proposed. The spatial features extracted by the variational encoder are feature distributions rather than feature samples, and the 3D point cloud corresponding to the input image is computed from the distribution function. By exploiting the relatively robust generative ability of the VAE, a point cloud model with fewer noise points can be generated. Compared with state-of-the-art methods such as 3D-LMNet and PSG-Net, the 3D geometry generated by the proposed method has a lower reconstruction error.

Finally, to enhance the generalization of the network, PoseNet, a module for estimating the image pose, is designed and integrated into the self-supervised learning method. By predicting the pose information of the input image, the point cloud generation network can distinguish the image viewpoint, which eliminates the reconstruction loss caused by the pose ambiguity of the input image. A 3D model with a unique orientation is obtained by combining the image pose information with the reconstructed point cloud, which provides more accurate input for the subsequent step of rendering the binary image and a better optimization direction for the point cloud generation network.
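The self-supervisory signal described in the abstract — rendering the predicted point cloud back to a binary image and comparing it with the input view via mean squared error — can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis implementation: `project_to_silhouette`, the orthographic projection, and the `[-1, 1]` point normalization are all hypothetical simplifications, and the hard rasterization shown here is not differentiable (the actual network would require a differentiable renderer and a projection consistent with the estimated pose).

```python
import numpy as np

def project_to_silhouette(points, img_size=32):
    """Orthographically project an (N, 3) point cloud onto the x-y
    plane and rasterize it into a binary silhouette image.
    Points are assumed to lie in [-1, 1]^3 (a hypothetical
    normalization; the abstract does not specify one)."""
    mask = np.zeros((img_size, img_size), dtype=np.float32)
    # Map x and y coordinates from [-1, 1] to integer pixel indices.
    px = np.clip(((points[:, 0] + 1) / 2 * (img_size - 1)).astype(int),
                 0, img_size - 1)
    py = np.clip(((points[:, 1] + 1) / 2 * (img_size - 1)).astype(int),
                 0, img_size - 1)
    mask[py, px] = 1.0
    return mask

def silhouette_mse_loss(points, target_mask):
    """Mean squared error between the rendered silhouette and the
    binary image derived from the input view -- the self-supervisory
    loss the point cloud generation network would minimize."""
    rendered = project_to_silhouette(points, target_mask.shape[0])
    return float(np.mean((rendered - target_mask) ** 2))
```

As a sanity check, a point cloud whose projection exactly covers the target silhouette yields zero loss, while any missing or extra covered pixel increases it; this is the pressure that drives the generator toward shapes consistent with the input view.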
Keywords/Search Tags: CNN, 3D reconstruction, self-supervised learning, auto-encoder, point cloud