| The abundance of face data on social media and the rapid development of deep learning have led to the emergence of deepfake generation technology. Deepfake generation synthesizes a face that retains the identity information of a source face while remaining consistent with the attribute information (expression, pose, lighting, etc.) of a target face. The technology is widely used in film and entertainment, but faces synthesized by current generation methods suffer from low generation quality and insufficient retention of facial information. In view of this, the main work and contributions of this thesis are as follows: (1) To address the insufficient preservation of the target face's pose and expression in the results of existing deepfake generation methods, and their poor generation quality on high-resolution images, a Transformer-based deepfake generation method is proposed. First, to overcome the loss of facial feature information during encoding, a cross-window face encoder based on Swin Transformer is proposed. Next, an identity generator is proposed to reconstruct identity-specific high-resolution images with high quality; it makes full use of the Transformer attention mechanism to improve the preservation of face identity information during reconstruction. Finally, a face conversion module synthesizes the result, using precise localization of facial key points and interpolation mapping to ensure that the pose and expression of the target face are fully preserved during conversion. (2) To address the insufficient retention of the source face's identity information in existing deepfake generation methods, the difficulty of generating faces between arbitrary identities, and low generation quality, a deepfake generation method based on identity injection is proposed. First, a face feature encoder is proposed to extract the identity features and attribute features of the target face. Then, a dual-attention-based identity injection module is proposed to embed the identity features of the source face into the feature information of the target face. Finally, a high-quality face generator synthesizes the result by upsampling. During model training, the identity information is constrained by a cosine-similarity identity loss function. Because changing the identity information also changes the face attribute information, an expression loss function and a reconstruction loss function are further used to constrain the attribute information. (3) The generation methods proposed in this thesis are tested on multiple face datasets, including the CelebA-HQ and FFHQ datasets. The experimental results on these datasets show that the methods effectively preserve the identity information of the source face and the attribute information of the target face, and achieve good visual quality. Using the video datasets FaceForensics++ and KoDF, which were produced by existing generation methods, the proposed methods are compared with several advanced generation methods. Qualitative and quantitative analysis in the comparative experiments shows that the proposed methods perform better than the other advanced generation methods in expression and pose control while achieving high-quality generation, which further demonstrates their effectiveness. |
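
The cosine-similarity identity loss mentioned in contribution (2) can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the form below (one minus the cosine similarity between two identity embeddings) is an assumption based on common practice. The function names `cosine_similarity` and `identity_loss` are hypothetical, and the embeddings are assumed to come from some identity feature extractor that is not shown here.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float], eps: float = 1e-8) -> float:
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero embeddings.
    return dot / max(norm_a * norm_b, eps)


def identity_loss(source_embedding: Sequence[float],
                  generated_embedding: Sequence[float]) -> float:
    """Assumed loss form: 1 - cos(source, generated).

    The value lies in [0, 2]. It approaches 0 when the generated face's
    identity embedding points in the same direction as the source's.
    """
    return 1.0 - cosine_similarity(source_embedding, generated_embedding)


if __name__ == "__main__":
    # Toy 3-dimensional embeddings; real identity embeddings are much larger.
    source = [0.2, 0.9, 0.4]
    generated = [0.25, 0.85, 0.45]
    print(identity_loss(source, generated))
```

Because the loss depends only on the angle between the two embeddings, rescaling either vector leaves it unchanged. This is one reason a cosine-based term is a natural choice for constraining identity independently of feature magnitude.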