Research On Cross-domain Image Translation And Its Application In Face Caricature Generation

Posted on: 2021-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: H D Hou
GTID: 2428330647451044
Subject: Computer Science and Technology
Abstract/Summary:
Cross-domain image translation aims to translate images from one domain to another while keeping the image content unchanged. Face caricature generation is a typical application of cross-domain image translation, in which images from the photo domain are translated into images in the caricature domain with the identity preserved. In this paper, we focus on the fine-grained category preservation problem in cross-domain image translation and the geometric exaggeration variety problem in face caricature generation. We propose the Cross-Domain Adversarial Auto-Encoder (CDAAE) and Multi-Warping Generative Adversarial Nets (MWGAN) to solve these two problems, respectively.

Cross-domain image translation is one of the basic tasks in computer vision. With the development of deep neural networks (DNNs), many DNN-based image translation methods have been proposed. However, these methods have limitations. One is that existing methods mostly treat image translation as a one-to-one mapping, and the other is that they tend to preserve all geometric structures and change only the texture when translating images. Our first work therefore focuses on the fine-grained category preservation problem in cross-domain image translation and proposes the Cross-Domain Adversarial Auto-Encoder. We assume that images can be disentangled into a content code and a style code, so that the model can learn a many-to-many mapping between images from different domains. In addition, to make the model capture fine-grained category information, we adopt a category distribution on the content code and train the model in a semi-supervised manner. Experiments show that CDAAE achieves better image diversity and fine-grained category preservation. We also design a domain adaptation algorithm based on CDAAE, which achieves state-of-the-art accuracy on benchmark datasets.

Automatic face caricature generation has been studied for a long time, with attention paid to both traditional and DNN-based methods. Exaggerating facial shapes is essential to caricature generation, and existing methods do this only by emphasizing the characteristics of the input face. However, the diverse art forms and rich emotions in real caricatures lead to a diversity of exaggeration styles. Therefore, building on the assumption from our first work of a shared content latent space and separated style latent spaces, our second work focuses on the diversity of exaggeration styles in caricature generation and proposes Multi-Warping GAN. Our method generates caricatures in two steps, style translation and geometric exaggeration, which are controlled by a style code and a landmark transform code, respectively, so that the model can generate caricatures with diverse color styles and various exaggeration styles for a given input photo. We also design the model in a dual-path way to learn the bidirectional mapping between latent codes and real images, which lets the model learn more meaningful color styles and geometric exaggerations. In addition, to preserve the identity of the input face, we apply a face recognition loss in both image space and landmark space. Experiments show that MWGAN can generate caricatures with various color and exaggeration styles that are also more realistic than caricatures generated by previous methods.
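The two-step idea described above (a style code controlling appearance and a landmark transform code controlling geometric exaggeration) can be illustrated with a toy sketch. This is not the thesis's implementation: the random linear maps stand in for trained networks, and every name and dimension here (`content_encoder`, `style_decoder`, `warp_decoder`, `strength`, and so on) is a hypothetical choice made for illustration. The sketch only computes exaggerated landmark positions and does not warp an image.

```python
import numpy as np

# Hypothetical toy dimensions; real models operate on images and deep features.
CONTENT_DIM, STYLE_DIM, WARP_DIM = 4, 2, 3
IMAGE_DIM, NUM_LANDMARKS = 8, 5

rng = np.random.default_rng(0)

# Random linear maps standing in for trained networks (an assumption).
content_encoder = rng.normal(size=(CONTENT_DIM, IMAGE_DIM))
style_decoder = rng.normal(size=(IMAGE_DIM, CONTENT_DIM + STYLE_DIM))
warp_decoder = rng.normal(size=(NUM_LANDMARKS * 2, WARP_DIM))


def translate_style(photo, style_code):
    """Step 1: keep the content code of the photo, swap in a sampled style code."""
    content_code = content_encoder @ photo
    return style_decoder @ np.concatenate([content_code, style_code])


def exaggerate(landmarks, warp_code, strength=0.1):
    """Step 2: shift landmarks by offsets decoded from a landmark transform code."""
    offsets = (warp_decoder @ warp_code).reshape(NUM_LANDMARKS, 2)
    return landmarks + strength * offsets


def sample_caricatures(photo, landmarks, n_samples, rng):
    """Draw independent style and warp codes to get diverse outputs for one input."""
    results = []
    for _ in range(n_samples):
        style_code = rng.normal(size=STYLE_DIM)
        warp_code = rng.normal(size=WARP_DIM)
        results.append(
            (translate_style(photo, style_code), exaggerate(landmarks, warp_code))
        )
    return results


photo = rng.normal(size=IMAGE_DIM)
landmarks = rng.uniform(size=(NUM_LANDMARKS, 2))
samples = sample_caricatures(photo, landmarks, n_samples=3, rng=rng)
```

Because the two codes are sampled independently, a single input yields several outputs that differ in both appearance and geometry, which is the one-to-many behavior the abstract describes. A zero warp code leaves the landmarks unchanged in this sketch.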
Keywords/Search Tags: Cross-domain image translation, Face caricature generation, Generative adversarial networks, Auto-encoder