Research On Face Depth Generation Model For Human-computer Interaction

Posted on: 2022-04-20; Degree: Master; Type: Thesis
Country: China; Candidate: Y Tian; Full Text: PDF
GTID: 2518306575983129; Subject: Computer technology
Abstract/Summary:
Talking face synthesis takes a single facial image and an arbitrary natural speech segment as input to a generation network and produces a talking-face video in which the portrait is synchronized with the voice. It is a cross-modal generation technology that addresses the problem of voice-to-video conversion and has application potential. Generative adversarial networks effectively address the technical bottlenecks that other generative models face in cross-modal generation. This thesis therefore constructs a talking face synthesis model based on a generative adversarial network. During encoding, the model maps the speech segment features and the facial image features into a common space, extracts the temporal dependence of the speech segment with a gated recurrent unit (GRU), and sequentially generates the frame sequence of the talking-face animation. The model is trained adversarially on the quality of the generated face images, and it replaces the JS divergence used as the evaluation distance with an f-divergence, which speeds up convergence of the model loss and improves the quality of the generated face video frames. To verify the alignment between the video frame sequence and the audio, a conditional adversarial network feeds the audio sequence as a condition into the discriminator for the video frame sequence, which improves the accuracy of sequence synchronization. This conditional adversarial network maps the 3D features of the animation to a 2D feature space, which can greatly reduce the demand for computing resources. To verify the effectiveness of each component of the model, ablation studies were performed on each part of the neural network, and the experimental results were evaluated quantitatively. The results show that the model outperforms existing models to a certain extent and provides a way to implement video generation with low video resources.
Figure 24; Table 9; Reference 53...
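The relationship between JS divergence and the broader f-divergence family mentioned in the abstract can be illustrated on small discrete distributions. The following is a minimal sketch, not the thesis's training loss: the abstract does not say which f-divergence is used, and all function names here (`f_divergence`, `f_kl`, `f_js`, `f_pearson`, `js_direct`) are hypothetical. It relies only on the standard definition D_f(P || Q) = sum_x q(x) f(p(x) / q(x)), where f is convex and f(1) = 0.

```python
import math

def f_divergence(p, q, f):
    """D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)); assumes every q(x) > 0."""
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))

def f_kl(t):
    # Generator for KL divergence: f(t) = t * log(t), with f(0) taken as 0.
    return t * math.log(t) if t > 0 else 0.0

def f_js(t):
    # Generator for JS divergence:
    # f(t) = 0.5 * (t * log(t) - (t + 1) * log((t + 1) / 2)).
    if t == 0:
        return 0.5 * math.log(2)  # limit of f(t) as t -> 0
    return 0.5 * (t * math.log(t) - (t + 1) * math.log((t + 1) / 2))

def f_pearson(t):
    # Generator for the Pearson chi-squared divergence: f(t) = (t - 1)^2.
    return (t - 1) ** 2

def js_direct(p, q):
    """JS divergence from its mixture definition, used as a cross-check."""
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

if __name__ == "__main__":
    # Illustrative distributions only; not data from the thesis.
    p = [0.1, 0.4, 0.5]
    q = [0.3, 0.3, 0.4]
    print("JS via f-divergence:", f_divergence(p, q, f_js))
    print("JS via mixture form:", js_direct(p, q))
    print("KL:", f_divergence(p, q, f_kl))
    print("Pearson chi-squared:", f_divergence(p, q, f_pearson))
```

The point of the sketch is that JS divergence is one member of the f-divergence family, so swapping the generator function f changes the distance being measured without changing the surrounding computation.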
Keywords/Search Tags: deep learning, generative adversarial network, GAN, face synthesis