| With the rapid development of computer vision technology, more and more deep generative content is appearing on social media and spreading rapidly. Face swapping is an important branch of deep generative content technology: it replaces faces in images or videos with a designated identity while preserving other attributes such as expression, head pose, and background. This technology has opened up new avenues in application areas such as video synthesis and filmmaking, virtual reality and computer games, and biometric and security systems. However, some AI-generated content, especially face swapping content, also poses potential threats to cybersecurity because it spreads easily and has high entertainment and commercial value. Malicious use of face swapping technology can lead to identity theft, the spread of fake news, social engineering attacks, and similar harms, which makes face swapping an important research direction.

Existing methods often struggle to achieve high-quality face swapping in practical applications, and their results may show inconsistencies in appearance, identity, pose, and expression, as well as frame jitter. Moreover, the overall face swapping task lacks corresponding evaluation standards and a well-established evaluation system.

The aim of this thesis is to design an algorithm capable of cinematic-quality face swapping. Addressing replacement technology, post-processing technology, and evaluation methods in turn, the thesis proposes a mask-guided implicit decoupling face swapping technique, a face swapping quality enhancement method based on implicit decoupling, and a face swapping quality assessment method based on the 3D Morphable Model (3DMM). Together these form a coherent body of research and yield an implicit decoupling face swapping system that achieves cinematic-quality results. The research and implementation in this thesis not only extend the development direction of
face swapping technology but also offer some support for cybersecurity. The main work and innovations of this thesis are as follows:

1. Proposed a Mask-Guided Implicit Decoupling Face Swapping Technique

To address inconsistencies in the appearance, identity, pose, and expression of generated results, a mask-guided implicit decoupling face swapping technique is proposed. This thesis decomposes the goal of high-quality face swapping into four consistency requirements, introduces mask guidance, and uses a neural network that carries identity information as an implicit decoupling strategy to perform the swap. To accommodate different task scenarios, two face swapping solutions are proposed: Deep Fake (DF) for strict face swapping tasks and Lightly Improved Auto-Encoder (LIAE) for adaptive face swapping tasks. The latter excels at adapting to facial shape and lighting. Experimental results show that both techniques achieve high-quality face swapping in their respective scenarios.

2. Proposed a Face Swapping Quality Enhancement Method Based on Implicit Decoupling

To address frame jitter in generated results, a face swapping quality enhancement method based on implicit decoupling is proposed. Regardless of the framework used, current face swapping methods inevitably have drawbacks such as information forgetting and frame jitter, and the proposed method aims to further improve the quality of their output. Specifically, a novel neural identity carrier is designed, which uses a U-Net to learn identity transformations from any face swapping proxy (i.e., the output of a face swapping algorithm). Because current face swapping methods cannot produce temporally consistent results for some video-level scenes, to better capture the content
information of the face swapping proxy, a stochastic uncertainty loss (data uncertainty loss) is introduced to model the jitter in the proxy data, allowing the identity carrier to accurately learn key identity information. In addition, a detail consistency constraint is introduced to ensure that the enhanced results retain rich detail, such as moles and wrinkles. Experiments on enhancing different types of face swapping videos show that the proposed method effectively alleviates information forgetting and frame jitter, further improving the quality of face swapping results.

3. Proposed a Face Swapping Quality Evaluation Method Based on the 3D Morphable Model (3DMM)

To address the current lack of evaluation standards for face swapping content, this thesis proposes a face swapping quality evaluation method based on the 3DMM. Although existing generative methods have shown powerful capabilities in controllable face generation, and the generated facial regions are highly realistic, the inherent 3D characteristics of the generated results remain under-analyzed. This is because widely used metrics, such as Inception Score (IS) and Fréchet Inception Distance (FID), focus on perceptual features rather than explicit 3D cues. Therefore, this thesis introduces the 3DMM as an intermediary, modeling face shape in terms of the 3DMM and using it to evaluate the identity consistency of generated results. With the metric in place, this thesis also improves the performance of an existing 3D-aware generative adversarial network through regularized optimization with the metric. Experimental results across various demographic groups show that, compared with existing evaluation metrics, the proposed metric measures identity consistency more accurately.

4. Constructed a Full-Process Controllable Implicit Decoupling Face Swapping System

To address the
completeness problem of current face swapping algorithms, this thesis constructs a full-process controllable implicit decoupling face swapping system. The system aims to improve the overall quality of face swapping, lower the barrier to use, and let users achieve high-quality face swapping with little effort. From data collection through to the integration stage, the system takes into account factors such as gender, race, and camera filters. To handle occlusion, it uses weakly supervised segmentation to determine the effective face area, and it integrates multiple training methods and custom model specifications to improve the extensibility of the model. The system also integrates the three contributions above: it handles occluded scenes through the mask-trained implicit decoupling algorithm, uses a U-Net to enhance the quality of face swapping results, and applies a 3DMM-based video-level consistency evaluation scheme to select the best face swapping result. Experiments under multiple extreme scenarios demonstrate the feasibility and effectiveness of the proposed method in practical applications.

Through the research in these four areas, the face swapping system designed in this thesis achieves cinematic-quality face swapping, providing new ideas and solutions for the development of face swapping technology. |
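
The idea of a 3DMM-based, video-level consistency evaluation used to select the best face swapping result can be illustrated with a minimal sketch. The abstract does not give the exact formulation of the metric, so everything below is an assumption made for illustration: the function names `identity_consistency` and `select_best`, the use of Euclidean distance between 3DMM shape coefficient vectors, the frame-to-frame difference as a jitter proxy, and the `jitter_weight` parameter are all hypothetical. The shape coefficients are assumed to come from some 3DMM fitting step that is not shown here.

```python
import numpy as np


def identity_consistency(source_shape, frame_shapes):
    """Score how well swapped frames keep the source identity's 3DMM shape.

    source_shape: (D,) shape coefficients of the source identity (assumed input).
    frame_shapes: (T, D) shape coefficients fitted to each swapped frame.
    Returns (mean_distance, jitter); lower is better for both.
    """
    source_shape = np.asarray(source_shape, dtype=float)
    frame_shapes = np.asarray(frame_shapes, dtype=float)
    # Per-frame Euclidean distance to the source identity in coefficient space.
    distances = np.linalg.norm(frame_shapes - source_shape, axis=1)
    # Mean frame-to-frame change of the fitted shape, a simple proxy for jitter.
    if len(frame_shapes) > 1:
        steps = np.linalg.norm(np.diff(frame_shapes, axis=0), axis=1)
        jitter = float(steps.mean())
    else:
        jitter = 0.0
    return float(distances.mean()), jitter


def select_best(source_shape, candidates, jitter_weight=1.0):
    """Return the candidate name with the lowest combined score, plus all scores."""
    scores = {}
    for name, shapes in candidates.items():
        dist, jitter = identity_consistency(source_shape, shapes)
        # Hypothetical combination: identity distance plus weighted jitter.
        scores[name] = dist + jitter_weight * jitter
    return min(scores, key=scores.get), scores


# Toy data: two candidate videos of four frames each, with 3-D coefficients.
source = np.zeros(3)
candidates = {
    "stable": np.tile([0.1, 0.0, 0.0], (4, 1)),
    "jittery": np.array(
        [[0.1, 0.0, 0.0], [-0.1, 0.0, 0.0], [0.1, 0.0, 0.0], [-0.1, 0.0, 0.0]]
    ),
}
best, scores = select_best(source, candidates)
print(best, scores)
```

In this toy example both candidates are equally far from the source identity on average, so the selection is decided by the jitter term alone. A real evaluation would depend on how the coefficients are fitted and normalized, which this sketch does not address.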