Detailed Human Reconstruction From Monocular Videos

Posted on:2024-03-20

Degree:Master

Type:Thesis

Country:China

Candidate:Y J Li

Full Text:PDF

GTID:2568306944961459

Subject:Computer Science and Technology

Abstract/Summary:

Clothed human reconstruction has long been a challenging research topic in both industry and academia. In the film and games industries, high-fidelity human reconstruction often requires pre-captured templates, multi-camera domes, professional studios, and the prolonged effort of artists. These strict requirements put it out of reach for most ordinary users, whose applications include personalized avatars for telepresence, AR/VR, virtual fitting, and anthropometry. Directly reconstructing high-fidelity human avatars from monocular videos therefore has significant practical value.

To capture the 3D geometry of dynamic humans from monocular videos efficiently and accurately, we propose a novel method that obtains spacetime-coherent surfaces of clothed humans from RGB videos. Compared with previous work, our method captures more detailed geometry and trains faster. To achieve these objectives, we adopt a neural implicit function to represent the surface in canonical space and employ a forward deformation field that maps the canonical space to the observation space for information integration. Building on this, we propose to optimize the inverse of the forward deformation using estimated depth, which improves the accuracy and efficiency of finding corresponding points between the canonical space and the observation space while preventing the noise introduced by inconsistent point correspondences. Additionally, the optimized geometry of the human avatar is used to improve the depth estimation during training, and this feedback loop promotes more reliable reconstruction. Furthermore, we improve the structure of the forward deformation by using a homeomorphic mapping for the local deformation of garments and a skeletal skinning transformation for the topology-varying deformation caused by large movements of the actor. Since a homeomorphic mapping is naturally cycle-consistent, this decomposition alleviates geometry-texture ambiguity during reconstruction.

Experiments on the real-world datasets PeopleSnapshot and iPer, as well as the digital human dataset Multi-Garment, demonstrate that our method achieves significantly finer surface geometry and faster training than other methods, while achieving appearance quality comparable to the state of the art. Qualitative experiments demonstrate that our method reconstructs human surfaces with more detail, higher accuracy, and better restoration of facial features, thus producing higher-fidelity geometry.
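
The claim that a homeomorphic mapping is naturally cycle-consistent can be illustrated with a minimal sketch. It assumes an additive coupling construction, which is one common way to build an invertible mapping and is not confirmed by the abstract to be the construction used in the thesis. The functions `shift_z` and `shift_xy` are hypothetical stand-ins for learned networks, and all names below are illustrative only.

```python
import math


def shift_z(x, y):
    # Hypothetical smooth offset for z computed from (x, y);
    # a trained network would stand here in a real system.
    return 0.05 * math.sin(3.0 * x) + 0.02 * y


def shift_xy(z):
    # Hypothetical smooth offset for (x, y) computed from z.
    return 0.03 * math.cos(2.0 * z), 0.01 * z


def forward(p):
    """Canonical -> observation: two additive coupling steps."""
    x, y, z = p
    z = z + shift_z(x, y)      # step 1: move z, keep (x, y) fixed
    dx, dy = shift_xy(z)
    return (x + dx, y + dy, z)  # step 2: move (x, y), keep z fixed


def inverse(q):
    """Observation -> canonical: undo the two steps in reverse order."""
    x, y, z = q
    dx, dy = shift_xy(z)        # z is unchanged by step 2, so this matches
    x, y = x - dx, y - dy
    z = z - shift_z(x, y)       # (x, y) are now the canonical values
    return (x, y, z)


def cycle_error(p):
    """Largest coordinate difference after a forward-inverse round trip."""
    r = inverse(forward(p))
    return max(abs(a - b) for a, b in zip(p, r))
```

Because each step changes one group of coordinates using only the other group, the inverse is available in closed form, and `cycle_error` stays at floating-point round-off for any input point. No iterative correspondence search is needed for this part of the deformation.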

Keywords/Search Tags:

human reconstruction, monocular video, animatable human, neural implicit function, volume rendering, depth estimation

Related items

1	Research On Key Technologies Of Human 3d Motion Estimation Based On Markless Monocular Video
2	The Study Of Human Pose Estimation In Monocular Video
3	Efficient And Convenient 3D Human Body Reconstruction
4	Monocular Vision Based 3D Reconstruction And Human Action Recognition From Skeleton Data
5	The Analysis And Reconstruction Of Human Motion Gesture Based On The Video
6	3D Human Pose Estimation Based On Monocular Video
7	3d Markerless Human Pose Estimation Based On Monocular Video Sequences
8	On Volume Rendering Reconstruction Of Virtual Chinese Human And Its Parallelization
9	3D Human Motion Reconstruction From Monocular Video
10	Research And Implementation Of 3D Human Reconstruction Algorithm Based On Multi-Attribute Prior