
Research On Person And Facial Image Synthesis Algorithm Based On Generative Adversarial Networks

Posted on: 2022-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: C F Wei
Full Text: PDF
GTID: 2518306527983109
Subject: Computer Science and Technology
Abstract/Summary:
Image synthesis has wide application value in data augmentation, image restoration, artistic creation, and other areas, and it is an important research direction in computer graphics, computer vision, and related fields. In recent years, the introduction of Generative Adversarial Networks (GANs) has attracted many researchers in China and abroad and has produced many excellent results. Compared with traditional methods, GANs are more efficient at image synthesis, and image synthesis has in turn become their most important application area. Although GANs can already synthesize realistic images, they still have deficiencies in cross-modal image synthesis and in image synthesis involving spatial relationships, where artifacts tend to appear in the results. This thesis carries out studies to address these issues. The specific research contents are as follows:

(1) A pose-guided, attribute-editable person image synthesis method is proposed. Existing pose-guided person image synthesis methods cannot edit the appearance attributes of the person image. Borrowing the idea of virtual try-on, this thesis performs semantic segmentation on the input reference person image to obtain the appearance attributes of different body parts. By modifying the attributes that need to be edited and feeding them into the network, person images with the target pose and the source appearance attributes can be synthesized. In addition, when a person image is synthesized by a purely convolutional GAN, the patterns and textures of the clothing are lost and the synthesized image becomes blurry. To solve this problem, a new spatial transformation algorithm is proposed, which spatially transforms the feature maps of multiple appearance attributes of the source person image and fuses them into the target person image. Each appearance attribute can be edited independently without affecting the others, and the patterns and textures on clothes are retained in the transformed person image. Experimental results show that the proposed method can control the appearance and pose of the target person image simultaneously and achieves higher quality than other methods.

(2) A two-stage audio-driven face image synthesis method is proposed. In the first stage, the speech features are disentangled to obtain the speaker's identity information and semantic information. The disentangled features are fed into an encoder to learn the speaker's facial movement features, so that the generated facial key points contain rich head motion information and the final synthesized face looks more natural. Faces synthesized by existing methods are prone to artifacts. In the second stage, the image features are therefore concatenated with the speech features from the first stage and fed into an encoder to learn a spatial attention mask. The mask assigns weights to the image features so as to eliminate artifacts that may occur during face image synthesis and to improve synthesis quality. Finally, a decoder synthesizes the target face image. Experimental results show that the proposed method synthesizes more vivid facial expressions and movements and produces a more natural visual effect than existing methods.

(3) A two-stage audio-driven virtual anchor image synthesis method is proposed. In the first stage, an audio-driven gesture key point generation algorithm based on ST-GCN is added to the audio-driven facial key point generation algorithm described above, so that the method learns facial and gesture movements from the speech information at the same time. In the second stage, the image synthesis method based on feature spatial transformation is used to eliminate the artifacts caused by arm movement and to ensure the quality of image synthesis. Because there is currently no unified way to evaluate gesture generation, a new gesture generation evaluation method is also proposed. Experimental results show that the proposed method generates natural and vivid facial and gesture movements at the same time and achieves high image synthesis quality.
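The spatial attention step in the second stage of method (2) can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `spatial_attention`, the fixed `weights` vector (a stand-in for what a learned encoder would produce), and the sigmoid used to keep the mask in (0, 1) are all assumptions made for illustration. The speech feature vector is broadcast to every spatial position, concatenated with the image channels at that position, reduced to one score per pixel, and the resulting mask rescales every image channel.

```python
import math


def spatial_attention(image_feat, audio_feat, weights, bias=0.0):
    """Weight a C x H x W feature map with a per-pixel mask in (0, 1).

    image_feat: list of C channels, each an H x W list of floats.
    audio_feat: list of D floats, broadcast to every spatial position.
    weights:    list of C + D floats (hypothetical stand-in for a learned
                1x1 convolution; not taken from the thesis).
    """
    channels = len(image_feat)
    height, width = len(image_feat[0]), len(image_feat[0][0])
    if len(weights) != channels + len(audio_feat):
        raise ValueError("weights must have C + D entries")

    mask = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Concatenate the image channels at (y, x) with the audio vector.
            fused = [image_feat[c][y][x] for c in range(channels)] + list(audio_feat)
            logit = sum(w * v for w, v in zip(weights, fused)) + bias
            mask[y][x] = 1.0 / (1.0 + math.exp(-logit))  # sigmoid

    # Apply the same spatial mask to every channel.
    weighted = [
        [[image_feat[c][y][x] * mask[y][x] for x in range(width)]
         for y in range(height)]
        for c in range(channels)
    ]
    return mask, weighted


# Toy inputs: 2 channels on a 2 x 2 grid and a 3-dimensional speech feature.
image_feat = [
    [[1.0, 0.0], [0.5, 2.0]],   # channel 0
    [[0.0, 1.0], [0.5, -1.0]],  # channel 1
]
audio_feat = [0.2, -0.4, 0.1]
weights = [1.0, -1.0, 0.5, 0.5, 0.0]  # hypothetical values, not learned

mask, weighted = spatial_attention(image_feat, audio_feat, weights)
```

In a trained network the mask would come from the encoder described in the abstract, so positions likely to contain artifacts could receive weights near zero; here the weights are fixed by hand only to show the data flow.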
Keywords/Search Tags:generative adversarial networks, image synthesis, disentanglement, spatial transformation, audio driven