
Detailed Human Reconstruction From Monocular Videos

Posted on: 2024-03-20  Degree: Master  Type: Thesis
Country: China  Candidate: Y J Li  Full Text: PDF
GTID: 2568306944961459  Subject: Computer Science and Technology
Abstract/Summary:
Clothed human reconstruction has long been a challenging research topic in both industry and academia. In the film and games industries, high-fidelity human reconstruction often requires pre-captured templates, a multi-camera dome, a professional studio, and the prolonged effort of artists. These strict requirements put such pipelines out of reach for most ordinary users and their applications, including personalized avatars for telepresence, AR/VR, virtual fitting, and anthropometry. Directly reconstructing high-fidelity human avatars from monocular videos therefore has significant practical value.

To capture the 3D geometry of a dynamic human from monocular video efficiently and accurately, we propose a novel method that obtains spacetime-coherent surfaces of clothed humans from RGB videos. Compared with previous work, our method captures more detailed geometry and trains faster. To achieve these objectives, we adopt a neural implicit function as the surface representation in canonical space and employ a forward deformation field that maps the canonical space to the observation space for information integration. Building on this, we propose to optimize the inverse of the forward deformation using estimated depth, which improves the accuracy and efficiency of finding corresponding points between the canonical space and the observation space while preventing the noise introduced by inconsistent point correspondences. In addition, the optimized geometry of the human avatar is used to improve the depth estimation during training; this feedback loop promotes more reliable reconstruction. Furthermore, we improve the structure of the forward deformation by using a homeomorphic mapping for the local deformation of garments and a skeletal skinning transformation for the topology-variant deformation caused by large movements of the actor. Since homeomorphic mappings are naturally cycle-consistent, this decomposition alleviates geometry-texture ambiguity during reconstruction.

Experiments on the real-world datasets PeopleSnapshot and iPer, as well as the digital human dataset Multi-Garment, demonstrate that our method achieves significantly finer surface geometry and faster training than other methods, while achieving appearance quality comparable to the state of the art. Qualitative experiments show that our method reconstructs human surfaces with more detail, higher accuracy, and better recovery of facial features, thus producing higher-fidelity geometry.
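The two-part forward deformation described in the abstract (a local garment deformation followed by a skeletal skinning transformation) can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not give the form of either stage, so the sketch assumes standard linear blend skinning for the skeletal part and uses a plain per-point offset as a stand-in for the learned homeomorphic mapping. All names (`apply_rigid`, `forward_deform`, `local_offset`, `IDENTITY`, `SHIFT_X`) and the 3x4 `[R | t]` bone layout are hypothetical.

```python
from typing import List, Sequence

Vec3 = Sequence[float]
Rigid = Sequence[Sequence[float]]  # assumed layout: 3x4 matrix [R | t] per bone


def apply_rigid(transform: Rigid, point: Vec3) -> List[float]:
    """Apply a 3x4 rigid transform [R | t] to a 3D point."""
    return [
        sum(transform[r][c] * point[c] for c in range(3)) + transform[r][3]
        for r in range(3)
    ]


def forward_deform(
    x_canonical: Vec3,
    skin_weights: Sequence[float],
    bone_transforms: Sequence[Rigid],
    local_offset: Vec3 = (0.0, 0.0, 0.0),
) -> List[float]:
    """Map a canonical-space point to observation space.

    Stage 1: local (garment) deformation. A fixed offset is used here as a
             placeholder for the learned homeomorphic mapping (assumption).
    Stage 2: skeletal deformation via linear blend skinning (assumption):
             a weighted sum of the point transformed by each bone.
    """
    x_local = [x_canonical[i] + local_offset[i] for i in range(3)]

    x_observed = [0.0, 0.0, 0.0]
    for weight, transform in zip(skin_weights, bone_transforms):
        moved = apply_rigid(transform, x_local)
        for i in range(3):
            x_observed[i] += weight * moved[i]
    return x_observed


# Two toy bones: one stays put, one translates by +2 along x.
IDENTITY = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
SHIFT_X = [[1.0, 0.0, 0.0, 2.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]

if __name__ == "__main__":
    # A point influenced equally by both bones moves half of the shift.
    print(forward_deform([1.0, 2.0, 3.0], [0.5, 0.5], [IDENTITY, SHIFT_X]))
    # prints [2.0, 2.0, 3.0]
```

In a setup like this, recovering the canonical point for an observed pixel means inverting `forward_deform`, which generally has no closed form once the skinning weights vary over space. That is the step the abstract says is guided by estimated depth.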
Keywords/Search Tags:human reconstruction, monocular video, animatable human, neural implicit function, volume rendering, depth estimation