Image Style Translation Based On CycleGAN

Posted on: 2020-08-10
Degree: Master
Type: Thesis
Country: China
Candidate: P Peng
GTID: 2428330596476547
Subject: Engineering
Abstract/Summary:
Image-to-image translation is widely used in practice and has therefore attracted great attention in the field of computer vision. Deep learning includes supervised, unsupervised, and semi-supervised learning. Image translation algorithms based on supervised learning are widely used in super-resolution, image completion, image style translation, and other fields. Most of these algorithms adopt an encoder-decoder architecture based on deep convolutional networks. This architecture is very effective, but it requires a large amount of paired training data, which greatly limits its scope of application. The generative adversarial network (GAN) is a deep learning technique that has emerged in recent years. Its applications have been extended to many fields such as video, image, text, and voice, and especially to image generation and image style translation. CycleGAN is an image style translation method proposed by Jun-Yan Zhu and others at the University of California, Berkeley. Using the CycleGAN network structure, a natural photograph is transformed into a picture with a given style. The method does not require the source images and the style images to be paired, which expands its range of application. However, the CycleGAN model still has room for improvement, and this thesis builds on it for further research.

In this thesis, we study CycleGAN, an unsupervised image style translation framework based on the generative adversarial network. To overcome the shortcomings of the original model, we propose an improved CycleGAN network model. First, we compare the quality of samples generated with the WGAN-GP, WGAN, LSGAN, and original GAN objective functions. We find that WGAN-GP stabilizes the training process and generates more realistic images, so we use the WGAN-GP and WGAN losses to replace the LSGAN and GAN losses in some objective functions. Second, to enhance the structural similarity between the generated image and the original image, we add an MS-SSIM loss to the cycle consistency loss. Third, drawing on the skip-connection structure of U-Net and the residual block principle, we use residual blocks and skip connections in the generator network to increase multi-scale invariance. Finally, in the discriminator network we use the multi-scale dilated discriminator proposed in this thesis to improve spatial geometric transformation and high-resolution image generation in image style translation.

The improved model shows better translation results in both qualitative and quantitative evaluation. The original CycleGAN model translates color and texture well but performs poorly on geometric shape transformation; the improved CycleGAN model proposed in this thesis improves greatly on this problem. Finally, we apply the improved CycleGAN model to a glasses removal task, addressing the effect of glasses, a common occlusion, on face recognition.
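The idea of adding a structural-similarity term to the cycle consistency loss can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes images are flattened lists of pixel values in [0, 1], and it uses a single-window, single-scale SSIM over the whole image as a simplified stand-in for MS-SSIM. The function names `global_ssim` and `cycle_loss`, the weight `alpha`, and the way the L1 and structural terms are mixed are all hypothetical choices made for illustration.

```python
from statistics import fmean


def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM between two equally sized flat pixel lists.

    Simplified stand-in for MS-SSIM: one scale, statistics taken over
    the whole image instead of local sliding windows.
    """
    c1 = (0.01 * data_range) ** 2  # standard SSIM stabilising constants
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = fmean(x), fmean(y)
    var_x = fmean([(a - mu_x) ** 2 for a in x])
    var_y = fmean([(b - mu_y) ** 2 for b in y])
    cov = fmean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def cycle_loss(x, x_rec, alpha=0.5):
    """Cycle consistency loss: L1 term mixed with a (1 - SSIM) term.

    x is the original image and x_rec its reconstruction after a full
    cycle through both generators; alpha is an assumed mixing weight.
    """
    l1 = fmean([abs(a - b) for a, b in zip(x, x_rec)])
    return (1 - alpha) * l1 + alpha * (1 - global_ssim(x, x_rec))


original = [0.1, 0.5, 0.9, 0.3]
print(cycle_loss(original, original))        # perfect reconstruction
print(cycle_loss(original, original[::-1]))  # structurally different
```

A perfect reconstruction gives a loss of essentially zero, while a reconstruction that preserves the pixel values but rearranges the structure is penalised by both terms. In a real training setup the same combination would be computed on image tensors with a differentiable, windowed, multi-scale SSIM.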
Keywords/Search Tags: deep learning, generative adversarial network, CycleGAN, image style translation