Research On Video Generation Based On Human Pose Transfer

Posted on: 2022-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
GTID: 2518306512971889
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
As a new video generation task, video-based human pose transfer has many application scenarios in the artificial intelligence industry. It can be used to automatically edit the pose of the human body in videos, and it applies broadly to short video production, animation production, and virtual reality, thereby advancing research on the intelligent production of multimedia content and promoting the development of computer vision and the social video industry. At present, mainstream video-based human pose transfer algorithms work in two stages: human pose estimation and human pose generation. However, limited by the accuracy of pose estimation and by the design of the pose generation network, the pose transfer results can be unreasonable or of poor quality. To this end, this thesis studies the problem further and builds a human pose transfer model based on convolutional neural networks.

For human pose estimation, occlusion or excessively fast motion of the people in the video leads to missing or incorrect 2D poses. In response, this thesis proposes a 2D human pose estimation model based on dual estimators. By introducing a pose estimation network with predictive capability, missing key points are filled in and wrong key points are corrected, yielding a more complete and accurate 2D pose of the human body. This 2D pose then serves as the input to the subsequent pose generation network, guiding it to perform accurate pose transfer.

For human pose generation, to improve accuracy, this thesis builds the pose generation network model as follows: 1) in the encoder of the pose generation network, a Self-Attention mechanism is introduced to enhance the network's ability to express the structural features of the 2D human pose, thereby improving the quality of the character's actions in the pose transfer result; 2) a human body extractor is introduced to construct a mask loss module that corrects the overall network loss and provides gradient information about character appearance and video background during training, thereby making the character appearance more realistic and the video background more stable in the pose transfer result. Finally, test results on the iPER dataset and the PRCV competition data verify the effectiveness of this method.
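
The abstract does not give formulas for the dual-estimator completion or the mask loss, so the following is only a minimal sketch of how the two ideas could look. Every name and rule here is an assumption for illustration: `merge_keypoints` uses a simple confidence threshold (`min_conf`) to decide when the second estimator's key point replaces the first, and `masked_l1_loss` uses a binary body mask to split an L1 error into a foreground (character) term and a background term with hypothetical weights `fg_weight` and `bg_weight`. The thesis itself may define both steps differently.

```python
from typing import List, Sequence, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, confidence); layout is an assumption


def merge_keypoints(primary: List[Keypoint], predicted: List[Keypoint],
                    min_conf: float = 0.3) -> List[Keypoint]:
    """Keep each primary key point unless its confidence is below min_conf,
    in which case fall back to the second estimator's key point."""
    if len(primary) != len(predicted):
        raise ValueError("both estimators must return the same number of key points")
    return [p if p[2] >= min_conf else q for p, q in zip(primary, predicted)]


def masked_l1_loss(generated: Sequence[Sequence[float]],
                   target: Sequence[Sequence[float]],
                   mask: Sequence[Sequence[int]],
                   fg_weight: float = 1.0,
                   bg_weight: float = 1.0) -> float:
    """Weighted sum of the mean L1 error inside the body mask (mask == 1)
    and the mean L1 error outside it (mask == 0)."""
    fg_sum = bg_sum = 0.0
    fg_n = bg_n = 0
    for g_row, t_row, m_row in zip(generated, target, mask):
        for g, t, m in zip(g_row, t_row, m_row):
            err = abs(g - t)
            if m:
                fg_sum += err
                fg_n += 1
            else:
                bg_sum += err
                bg_n += 1
    fg = fg_sum / fg_n if fg_n else 0.0  # character appearance term
    bg = bg_sum / bg_n if bg_n else 0.0  # video background term
    return fg_weight * fg + bg_weight * bg


# Toy usage: the second key point is missing (confidence 0.0) in the primary estimate.
primary = [(10.0, 20.0, 0.9), (0.0, 0.0, 0.0)]
predicted = [(11.0, 21.0, 0.5), (30.0, 40.0, 0.6)]
merged = merge_keypoints(primary, predicted)

# Toy 2x2 single-channel "images" with a diagonal body mask.
generated = [[1.0, 0.0], [0.5, 0.5]]
target = [[0.0, 0.0], [0.5, 1.0]]
mask = [[1, 0], [0, 1]]
loss = masked_l1_loss(generated, target, mask, fg_weight=2.0, bg_weight=1.0)
```

Separating the foreground and background terms means each region contributes its own error signal, which is one plausible reading of how a mask loss could supply gradient information for both character appearance and video background.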
Keywords/Search Tags: Human body pose transfer, Human body pose estimation, Pose generation network, Self-Attention mechanism, Mask loss