
Unpaired Image-to-Image Translation

Posted on: 2021-03-17    Degree: Master    Type: Thesis
Country: China    Candidate: X H Wu    Full Text: PDF
GTID: 2428330623968546    Subject: Engineering
Abstract/Summary:
Image-to-image translation is a class of vision and graphics problems whose goal is to learn the mapping between input and output images from a set of image pairs. The technology has a wide range of applications, such as super-resolution, image stylization, and image editing. Traditional image-to-image translation methods mainly learn from paired images. However, such paired data must be manually annotated at considerable labor cost, which limits the size of the datasets. Unpaired data, by contrast, is abundant in daily life: it has higher practical value but is more difficult to exploit. This dissertation studies image-to-image translation based on unpaired data. Through analysis and improvement of existing methods, both the efficiency of translation and the quality of the generated images are substantially improved.

Current research on unpaired image-to-image translation falls into two categories: dual-domain and multi-domain translation. In the dual-domain setting, existing work is mainly based on cycle consistency, using two neural networks to learn the two translation directions. These methods translate directly in pixel space, so the generated images lack texture detail, and they do not exploit the basic attributes shared between images. In the multi-domain setting, some recent work introduces a condition variable to control the translation direction, but it remains difficult to generate images that carry the attributes of two different domains simultaneously, or to control the degree to which an attribute is expressed. For example, applications often need to generate images that are both beautified and smiling, or to control the degree of beautification in the output. Based on this analysis, this dissertation proposes two improvements. In the dual-domain image translation
framework, a deep shared spatial structure is proposed: hierarchical semantic representations of the images are extracted through a shared VGG network and then fused by a generative adversarial network. For multi-domain image translation, this dissertation proposes a network structure with two discriminators, which judge the generated image at the hidden layer and at the pixel layer, respectively, so that a single generated image can carry multiple domain attributes simultaneously and the degree of attribute expression can be controlled. The improved models have been evaluated in detail on common datasets and show promising results.
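The cycle-consistency idea underlying the dual-domain methods discussed above — two generators trained so that translating A→B→A reproduces the input — can be summarized with a standard L1 cycle loss. The sketch below is illustrative only: the toy "generators" are exact inverse linear maps standing in for the neural networks, so the cycle loss is essentially zero.

```python
import numpy as np

def cycle_consistency_loss(g_ab, g_ba, x_a, x_b):
    """L1 cycle loss: A -> B -> A and B -> A -> B should reproduce the inputs."""
    loss_a = np.abs(g_ba(g_ab(x_a)) - x_a).mean()
    loss_b = np.abs(g_ab(g_ba(x_b)) - x_b).mean()
    return loss_a + loss_b

# Toy stand-in "generators": exact inverses of each other, so the cycle
# reconstruction is perfect and the loss is (numerically) zero.
g_ab = lambda x: 2.0 * x + 1.0      # domain A -> domain B
g_ba = lambda x: (x - 1.0) / 2.0    # domain B -> domain A

x_a = np.random.rand(4, 64)         # a batch of flattened "images" from A
x_b = np.random.rand(4, 64)         # a batch of flattened "images" from B
loss = cycle_consistency_loss(g_ab, g_ba, x_a, x_b)
print(loss < 1e-9)  # → True
```

With real generators the loss is nonzero and is minimized jointly with the adversarial losses of the two domains.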
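The two-discriminator idea for multi-domain translation — scoring the generated image both at a hidden (feature) layer and at the pixel layer, and weighting the two adversarial terms — might be summarized as a combined generator loss like the following. The discriminators, the `encode` stand-in for a hidden layer, and the weights are all hypothetical placeholders for illustration, not the thesis's actual architecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_real(p):
    """Binary cross-entropy against the 'real' label (1)."""
    return -np.log(np.clip(p, 1e-7, 1.0))

# Placeholder discriminators: a pixel-level score on the raw image and a
# feature-level score on a hidden representation.
def pixel_disc(img):
    return sigmoid(img.mean())

def feature_disc(feat):
    return sigmoid(feat.mean())

def encode(img):
    # crude stand-in for the generator's hidden-layer representation
    return img.reshape(img.shape[0], -1)[:, ::2] * 0.5

def generator_adv_loss(fake_img, w_pixel=1.0, w_feature=1.0):
    """Generator wants BOTH discriminators to judge the fake as real."""
    l_pix = bce_real(pixel_disc(fake_img))
    l_feat = bce_real(feature_disc(encode(fake_img)))
    return w_pixel * l_pix + w_feature * l_feat

fake = np.random.rand(2, 8, 8)      # a batch of fake "images"
loss = generator_adv_loss(fake)
print(loss > 0.0)  # → True
```

Weighting the pixel-level and feature-level terms separately is one plausible way to trade off low-level fidelity against attribute-level control.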
Keywords/Search Tags: generative adversarial network, unpaired image-to-image translation, multi-domain image translation, image stylization