The task of face image synthesis is to generate faces that carry the identity information of the source image and the appearance characteristics of the reference image. Image synthesis driven by artificial intelligence can simplify the workflow of photographers and creators and bring creativity and digital art to a new level. In existing face image synthesis work, when the face poses of the source and reference images do not match, the generated results often contain artifacts and unnatural regions. In addition, extracting local styles from reference images often requires paired images and semantic segmentation masks, which raises a barrier for users who want to interact with images. To address these problems, this thesis takes the Generative Adversarial Network (GAN) as its technical route and proposes two networks: EnStyleGAN_Ⅰ, based on attention-guided alignment, and EnStyleGAN_Ⅱ, based on Feature Pyramid Networks. The main research contents of this thesis are as follows:
1. Image synthesis based on generative adversarial networks is studied. First, the principle of GANs is analyzed, several classical GANs are discussed, and their loss functions and characteristics are summarized. Second, representative GAN-based image synthesis works are discussed, including GauGAN based on Spatially-Adaptive Normalization, SEAN based on Semantic Region-Adaptive Normalization, CoCosNet based on weakly supervised learning, and StarGAN v2 for diverse image synthesis across multiple domains. Finally, work on GAN inversion is summarized and work on the latent space is discussed.
2. The EnStyleGAN_Ⅰ network based on attention-guided alignment is designed and implemented to solve the problem of facial semantic misalignment between the source image and the reference image. EnStyleGAN_Ⅰ extends StyleGAN as its backbone network and
consists of a content/style/semantic encoder GE, a Spatial Feature Transform (SFT) layer, an Attention-Guided Alignment (AGA) module and a feature translation network M2S. EnStyleGAN_Ⅰ computes the semantic correlation matrix between the source image and the reference image through the AGA module and aligns the style of the reference image with the source image. In addition to the generative adversarial loss, image reconstruction loss and perceptual loss, EnStyleGAN_Ⅰ uses correspondence regularization to impose semantic constraints on the task. Experimental results show that EnStyleGAN_Ⅰ achieves more accurate style transfer when the semantics of the source and reference images are not aligned. Compared with the mainstream face image synthesis models StarGAN v2 and SEAN, the FID of EnStyleGAN_Ⅰ on the face synthesis task decreases by 4.46 and 3.26, respectively.
3. The EnStyleGAN_Ⅱ network based on a feature pyramid is designed and implemented for the local editing task of extracting local styles from a reference image in face image synthesis. Built on StyleGAN, EnStyleGAN_Ⅱ comprises a feature extraction network FEN, a style transformation network F2S and a local editing module LE. FEN follows the structure of Feature Pyramid Networks, with bottom-up and top-down pathways and lateral connections. In the bottom-up pathway of the encoder, bottleneck blocks are used to extract image features via 1 × 1 convolution. In the top-down pathway, high-resolution layers are built by up-sampling, and connections are added between the corresponding feature maps so that semantic positions do not shift. The local editing module LE fuses style codes from different images by element-wise multiplication to achieve local editing. In addition to the generative adversarial loss, image reconstruction loss and perceptual loss, EnStyleGAN_Ⅱ uses a style code reconstruction loss and a discriminator gradient penalty. Experimental results show that, compared with SEAN, the mainstream
face image synthesis model for local editing, EnStyleGAN_Ⅱ generates more natural results, and the regions it edits blend well into their surroundings without abrupt transitions. Compared with SEAN, the average accuracy (AP) of EnStyleGAN_Ⅱ decreases by about 8%, and the mask-based mean square error decreases by 0.063.
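
The two core operations named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the thesis implementation. It assumes that the AGA correlation matrix is a cosine similarity between per-position feature vectors, that alignment is a softmax-weighted average of reference styles, and that local editing blends two style codes with a 0/1 mask by element-wise multiplication. All function names, the `temperature` parameter and the mask-blend formula are hypothetical choices made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (returned unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def correlation_matrix(src_feats, ref_feats):
    """Cosine similarity between every source and every reference position.

    Assumption: the abstract only says a semantic correlation matrix is
    computed; cosine similarity is one common choice, used here as a stand-in.
    """
    src = [l2_normalize(v) for v in src_feats]
    ref = [l2_normalize(v) for v in ref_feats]
    return [[sum(a * b for a, b in zip(s, r)) for r in ref] for s in src]


def softmax(row, temperature):
    """Numerically stable softmax; a small temperature sharpens the matching."""
    peak = max(row)
    exps = [math.exp((x - peak) / temperature) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def align_reference_style(src_feats, ref_feats, ref_styles, temperature=0.01):
    """Warp reference styles onto the source layout (hypothetical helper).

    Each source position receives a weighted average of the reference styles,
    with weights taken from its row of the correlation matrix.
    """
    dim = len(ref_styles[0])
    aligned = []
    for row in correlation_matrix(src_feats, ref_feats):
        weights = softmax(row, temperature)
        aligned.append(
            [sum(w * s[d] for w, s in zip(weights, ref_styles)) for d in range(dim)]
        )
    return aligned


def local_edit(src_code, ref_code, mask):
    """Blend two style codes element-wise (assumed form of the LE fusion).

    mask[i] == 1 takes the reference entry, mask[i] == 0 keeps the source entry.
    """
    return [m * r + (1 - m) * s for s, r, m in zip(src_code, ref_code, mask)]
```

For example, if the reference features are the source features in swapped order, `align_reference_style` returns the reference styles swapped back into the source order, which is the behaviour alignment is meant to provide when the two faces are posed differently.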