Research And Application On Pose Estimation Theory For 3D Reconstruction Based On Implicit Representations And Related Technologies

Posted on:2024-10-11

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Z X Guo

Full Text:PDF

GTID:1528307355971499

Subject:Computational Mathematics

Abstract/Summary:


Three-dimensional reconstruction is a key research topic in computer vision and graphics, aimed at recreating real-world scenes digitally. The technology is widely used in virtual reality and augmented reality, has become an important means of content production for the metaverse, and is an important component of digital twins. In archaeology and cultural heritage conservation, three-dimensional reconstruction can accurately reproduce three-dimensional models of historical relics and cultural artifacts, providing archaeologists with more flexible and intuitive research methods and offering the public immersive cultural experiences; it thus plays an important role in archaeological research and cultural heritage. Neural Radiance Fields (NeRF), owing to their outstanding performance in three-dimensional scene reconstruction and realistic rendering, have attracted widespread attention and are regarded as a potential key technology for scene reconstruction in virtual reality, augmented reality, and related fields. However, most NeRF-related research and applications rely on accurate pose data. When pose data are inaccurate or missing, reconstruction quality is poor or reconstruction fails outright, so how to reconstruct the scene effectively in this case is a worthwhile research problem. This thesis further studies joint pose optimization, which optimizes the scene and the poses simultaneously while the neural network is trained. The main research work and innovations of this thesis cover the following five aspects:

(1) To address the low reconstruction efficiency of joint pose estimation methods, a joint pose optimization method based on multi-resolution hash encoding, BiResNeRF, is proposed. The method achieves fast, high-precision reconstruction of scenes with inaccurate or missing poses. First, a feature fusion module is introduced so that hash encoding modules of different resolutions can participate effectively in the reconstruction task. Then, a two-stage training strategy is adopted, using a smooth warm-up learning rate schedule and a coarse-to-fine sampling strategy to ensure efficient and stable training. Compared with other algorithms, the method significantly reduces reconstruction time, with an average reduction of 34.37% on synthetic datasets. Compared with MLP-based joint pose optimization methods, the improvement is even larger: reconstruction time is reduced by a factor of 4 to 18.44. The increase in speed does not compromise the accuracy of pose estimation or the quality of rendered images. Finally, experiments verify the adaptability and effectiveness of the algorithm in challenging scenes with no poses, low texture, and reflections.

(2) In the research on rotation representation, the sensitivity of joint pose optimization to pose differences between images during reconstruction is analyzed first, focusing on the characteristics of pose representations used as trainable parameters in neural networks. Reducing coupling and adopting a continuous 6D representation improves the stability and accuracy of pose estimation and reconstruction. Second, the characteristics of different rotation representations used as labels in rotation estimation problems are discussed, and a rotation estimation network is applied to the problem of automatic mapping in archaeology. The study shows that the lower the dimensionality of the rotation representation, the easier it is to establish a mapping from point clouds to the rotation representation space, which also makes training smoother. In addition, a rotation estimation algorithm based on a serial network is proposed. Experimental results show that, on a dataset of Buddha statues with few samples and large errors, the error of the serial network is 30.23% lower than that of a single-module network.

(3) To address the applicability of joint pose optimization in real-world scenarios, this thesis introduces a new loss function and a coarse-to-fine two-stage training method. First, the loss function adds constraints on depth and opacity to the traditional photometric loss, which improves the stability of the 3D structure and reduces noise. Second, the two-stage training method facilitates pose estimation and high-precision 3D reconstruction. Finally, experiments confirm that the method improves pose estimation accuracy and 3D structure stability in real-world scenes.

(4) To address the narrow reconstruction range of joint pose optimization, a 3D scene merging algorithm based on local reconstruction is proposed, which effectively handles large-scale 3D reconstruction from image sequences without poses. The algorithm adopts a coarse-to-fine reconstruction strategy for local scenes and introduces an adaptive bounding box position update during reconstruction so that the multi-resolution hash encoding network can be optimized effectively. Finally, the camera trajectories in overlapping areas are merged to complete the large-scale 3D reconstruction. Experimental results show that the camera trajectories generated by this method are smoother and that the 3D structure remains highly precise.

(5) This thesis collects and organizes two archaeology-related datasets to evaluate the practical performance of joint pose optimization in both local and large-scale archaeological scenes. Experimental results demonstrate that the proposed local reconstruction and large-scene stitching methods can effectively handle a variety of archaeological scenes, including ceramics, pottery, bronzes, gold objects, stone statues, and more complex large-scale scenes. These results further highlight the application value of the method in archaeology and cultural heritage preservation.

In summary, this thesis proposes solutions to the low reconstruction efficiency, sensitivity to pose differences, poor applicability in real-world scenes, and narrow reconstruction range of joint pose estimation, offering new insights for the research and real-world application of joint pose optimization algorithms. The proposed algorithms have been shown to be feasible, and some have been successfully applied to archaeological and cultural heritage preservation projects in Chongqing, Sichuan, and Guizhou.
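
For readers unfamiliar with the continuous 6D rotation representation mentioned in point (2), the following is a minimal sketch of how such a representation is commonly mapped to a rotation matrix by Gram-Schmidt orthogonalization. It assumes the widely used construction in which the six numbers are read as two 3D vectors; the abstract does not give the thesis's exact formulation, and the function name rotation_from_6d and its helpers are hypothetical.

```python
import math


def _normalize(v):
    # Scale a 3-vector to unit length.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]


def rotation_from_6d(d6):
    """Map six unconstrained numbers to a 3x3 rotation matrix.

    Assumption: d6 holds two 3D vectors a1, a2 that are not parallel.
    The returned matrix has the orthonormal vectors b1, b2, b3 as columns.
    """
    a1, a2 = d6[:3], d6[3:]
    b1 = _normalize(a1)
    # Remove the component of a2 along b1, then normalize (Gram-Schmidt).
    proj = _dot(b1, a2)
    b2 = _normalize([y - proj * x for x, y in zip(b1, a2)])
    # The third axis is fixed by the right-hand rule.
    b3 = _cross(b1, b2)
    return [[b1[i], b2[i], b3[i]] for i in range(3)]


if __name__ == "__main__":
    for row in rotation_from_6d([1.0, 2.0, 0.5, -0.3, 0.8, 1.1]):
        print(row)
```

Because any pair of non-parallel vectors yields a valid rotation, the six values can be optimized freely as trainable parameters without an explicit orthogonality constraint, which is the usual motivation for this representation in pose optimization.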

Keywords/Search Tags:

3D reconstruction, implicit representation, pose estimation, pose representation, scene merge


Related items

1	Research On Mixture Part-based Pose Estimation Methods And Their Applications
2	Research On Action Recognition Based On Pose Estimation
3	Shape Representation In 6-DoF Object Pose Estimation
4	6D Pose Estimation Of 3D Objects Based On Monocular Images
5	Research On Human Pose Estimation And Pose Distance Metric Learning
6	Research On Pose Estimation Of Three-Dimensional Models Based On The Object Coordinate Representation
7	Research On Key Technologies Of 3D Object Reconstruction Based On UV Representation
8	Research On 3D Human Pose Estimation Method On Monocular Video
9	A Research Of Human Pose Estimation Based On Deep Convolutional Neural Network
10	Researches On Multi-person Human Pose Estimation In Natural Scene