Non-rigid Structure From Motion Based On Unsupervised Neural Network Model

Posted on: 2024-09-09
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Peng
GTID: 2568307115992809
Subject: Information and Communication Engineering
Abstract/Summary:
Non-rigid structure from motion (NRSfM) with a monocular camera is an active research topic in computer vision. Because the observed object undergoes non-rigid motion throughout the observation, the problem is inherently underconstrained: the reprojection constraint alone does not yield a unique, accurate solution, and additional constraints are required. Existing NRSfM methods handle 3D reconstruction of simple motion well, but they still fall short on complex motion or on large numbers of feature points. In addition, because large, high-precision NRSfM datasets are lacking, it is more reasonable to train the network in an unsupervised manner, and the reconstruction results can then be compared with those of existing methods. To address these problems, this paper proposes two NRSfM solutions based on unsupervised neural network models, applied respectively to sparse and dense datasets of complex motion. The main contributions of this paper are as follows:

1. For sparse datasets of complex motion, taking invariance and closure as the theoretical basis, this paper proposes a sparse 3D motion reconstruction algorithm that combines a self-supervised network with a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP). Following the invariance theory, which holds that reconstructions from 2D observations of the same 3D structure under any viewpoint should be similar, this paper applies graph convolution to 3D motion reconstruction for the first time and proposes a self-supervised network based on graph convolution and a Transformer encoder. Following the closure theory of similarity between 2D projection probability distributions, a 2D structure discriminator is added to this self-supervised network to form the WGAN-GP architecture. Extensive experiments analyze the importance of the adjacency matrix as prior knowledge and the effectiveness of the 2D structure discriminator.

2. For dense datasets, this paper proposes the Reconstruction and Optimization Neural Network (RONN). RONN estimates depth instead of solving for the 3D structure directly, which reduces the theoretical amount of computation. RONN is implemented mainly with convolutional neural networks and has three modules, for fusion, reconstruction, and optimization respectively. The loss function consists mainly of two terms: the temporal smoothing loss and the Procrustes-alignment loss. The Minimum Singular Value Ratio (MSR) is used to weight both terms.

Experimental results show that, in the sparse case, the proposed model based on the self-supervised network and WGAN-GP shows superior performance on the benchmark dataset and the CMU MOCAP dataset. RONN has excellent reconstruction performance in the dense case and also reconstructs sparse datasets well.
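
The abstract names a reprojection constraint, a temporal smoothing loss, and a Procrustes-alignment loss without giving their formulas. The NumPy sketch below shows one plausible form of each, purely as an illustration and not as the thesis's implementation. Everything specific here is an assumption: the orthographic camera model, the choice to align each frame to the next one, the rigid (rotation plus translation, no scale) alignment, and all function names. The MSR weighting is omitted because the abstract does not define it.

```python
import numpy as np


def reprojection_loss(S, W):
    """Mean squared 2D error between projected 3D points and observations.

    S: (F, P, 3) estimated 3D points for F frames and P points.
    W: (F, P, 2) observed 2D points.
    Assumes an orthographic camera looking along z (an assumption,
    not stated in the abstract), so projection just drops the depth.
    """
    return float(np.mean(np.sum((S[..., :2] - W) ** 2, axis=-1)))


def temporal_smoothness_loss(S):
    """Mean squared displacement of each point between consecutive frames."""
    diffs = S[1:] - S[:-1]
    return float(np.mean(np.sum(diffs ** 2, axis=-1)))


def procrustes_align(A, B):
    """Rigidly align point set A (P, 3) to B (P, 3) with the Kabsch method.

    Returns A after the optimal rotation and translation (no scaling).
    """
    mu_a, mu_b = A.mean(axis=0), B.mean(axis=0)
    A0, B0 = A - mu_a, B - mu_b
    U, _, Vt = np.linalg.svd(A0.T @ B0)
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return A0 @ R.T + mu_b


def procrustes_alignment_loss(S):
    """Mean squared residual after aligning each frame to the next one.

    Aligning consecutive frames is a hypothetical choice made for this
    sketch; the residual measures deformation with rigid motion removed.
    """
    residuals = [
        np.mean(np.sum((procrustes_align(S[t], S[t + 1]) - S[t + 1]) ** 2, axis=-1))
        for t in range(len(S) - 1)
    ]
    return float(np.mean(residuals))
```

Under these assumptions, a sequence that only rotates and translates gives a Procrustes-alignment loss near zero while its temporal smoothness loss is nonzero, which is why the two terms are complementary.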
Keywords/Search Tags: Non-rigid Body, Three-dimensional Reconstruction, Graph Convolution, Generative Adversarial Network, Minimum Singular Value Ratio