Face-to-caricature style transfer is a computer vision technology that applies caricature style to human faces, generating new images with an exaggerated style while preserving the structural details of the face. It has been widely used in fields such as art creation and social media, and has important research value. In view of the small number of datasets with limited applicability in this field, the mismatch between the style of generated caricatures and facial details, the insufficient artistry, and the flickering and out-of-sync expressions in generated caricature videos, this thesis conducts in-depth research and analysis. This thesis constructs a high-quality face caricature dataset and, based on generative adversarial networks, designs algorithms in the two directions of image generation and video generation, which improve the quality and artistic effect of the results to a certain extent. The main research contents of this thesis are as follows:

1. This thesis constructs a high-resolution paired dataset suitable for the field of face-to-caricature style transfer. The dataset contains 1600 high-resolution face and caricature images, paired and corresponding to 200 people. All images have undergone facial feature alignment and data augmentation. Compared with the public caricature datasets, this dataset offers high resolution, aligned faces, and a consistent and uniform style distribution, which makes it easier for neural networks to learn the correspondence between faces and caricatures and yields higher quality.

2. For face-to-caricature image generation, this thesis studies a generation algorithm based on a multiple attention mechanism. Aiming at the blurring and chaotic artifacts shown by existing work on this task, this thesis uses a multi-attention guidance module to improve the generator's feature extraction and generation capability for key facial parts and important semantic features, and to improve the discriminative ability of the multi-scale discriminators on high-resolution images; pixel-level, frequency-level, and semantic-feature-level constraints (the frequency-level constraint is sketched below) work together to bring the distribution of generated caricatures closer to the real caricature domain.

3. For face-to-caricature image generation, this thesis studies a generation algorithm based on multi-scale feature fusion in a composite generative adversarial network. To resolve the mismatch between regional details and style features in generated caricatures, this thesis introduces a multi-level generator and a multi-scale feature fusion module to generate and fuse the global facial structure and facial-region features; a multi-level discriminator performs the corresponding discrimination tasks to match the generator; total variation regularization (sketched below) and line continuity constraints further improve generation quality and enhance the artistic effect.

4. For face-to-caricature video generation, this thesis studies a generation algorithm based on multi-scale supervised learning of key points and domain adaptation. Aiming at the low accuracy of key point detection in motion-driven methods, this thesis uses inverted residuals and multi-scale feature fusion to extract and map features in depth, obtaining more accurate predictions; a gradient reversal layer and a domain discriminator (sketched below) adapt the motion-driven model trained on the face source domain to the caricature target domain, generating more realistic and better-stylized caricature videos.
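The following is a minimal sketch of one possible form of the frequency-level constraint mentioned in contribution 2, assuming it compares generated and real caricatures in the 2-D Fourier domain; the exact formulation used in the thesis may differ.

```python
import torch
import torch.nn.functional as F

def frequency_loss(generated: torch.Tensor, real: torch.Tensor) -> torch.Tensor:
    """L1 distance between the 2-D FFT amplitude spectra of generated and real caricatures.

    A frequency-domain term of this kind complements pixel-level losses by
    penalizing missing high-frequency content such as blurred strokes.
    Inputs are batches of images shaped (N, C, H, W).
    """
    fft_gen = torch.fft.fft2(generated, norm="ortho")
    fft_real = torch.fft.fft2(real, norm="ortho")
    return F.l1_loss(torch.abs(fft_gen), torch.abs(fft_real))
```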
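A minimal sketch of the total variation regularization term named in contribution 3, in its standard anisotropic form; how it is weighted against the adversarial and line continuity terms is not specified here.

```python
import torch

def total_variation_loss(img: torch.Tensor) -> torch.Tensor:
    """Anisotropic total variation over a batch of images (N, C, H, W).

    Penalizes differences between neighboring pixels, which suppresses noise
    and favors clean, continuous caricature lines.
    """
    diff_h = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()
    diff_w = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()
    return diff_h + diff_w
```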
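Contribution 4 adapts the motion-driven model with a gradient reversal layer and a domain discriminator; the sketch below shows the standard gradient reversal construction used in DANN-style domain adaptation, with the discriminator head and feature dimension chosen only for illustration.

```python
import torch
from torch import nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates (and scales) gradients in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None


class DomainDiscriminator(nn.Module):
    """Predicts whether features come from the face (source) or caricature (target) domain.

    Reversed gradients push the shared feature extractor toward domain-invariant
    features, so key-point prediction learned on faces transfers to caricatures.
    """

    def __init__(self, in_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, 1),
        )

    def forward(self, features: torch.Tensor, lambd: float = 1.0) -> torch.Tensor:
        return self.net(GradReverse.apply(features, lambd))
```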