
Foreground Image-to-Image Domain Translation Based On Self-attention Mechanism

Posted on: 2021-04-28  Degree: Master  Type: Thesis
Country: China  Candidate: G L Wang  Full Text: PDF
GTID: 2428330611951380  Subject: Software engineering
Abstract/Summary:
Unsupervised image translation has been a hot research topic in recent years. It maps images from one domain to another without paired training data, and it is widely used in image stylization, domain-adaptive learning, and other fields. Current unsupervised translation methods find it difficult to modify a single object without also changing the background, or the way multiple objects interact in a scene; separating the foreground from the background is therefore a major challenge.

This thesis builds on MUNIT, an unsupervised image-translation framework that decomposes an image representation into a domain-invariant content code and a style code that captures domain-specific attributes. To address the shortcomings of existing algorithms in foreground image-domain translation, two improvements are proposed. First, exploiting U-net's ability to link low-level features of the input image to the generated image, a U-net foreground-acquisition network is proposed: it separates the foreground from the background, preserves the information of each, and then translates only the foreground. Second, inspired by the attention mechanism of the human brain, whose basic idea in computer vision is to let the model ignore irrelevant information and focus on key information, a hybrid-domain self-attention module is designed by combining spatial-domain and channel-domain self-attention. It is added to both the generator and the discriminator, suppressing the influence of the target domain on the background and controlling image details.

Experiments are conducted on the horse2zebra and apple2orange datasets, with MUNIT, UAGAN, and CycleGAN as baselines. The results show clear improvement in subjective visual quality, and the models are objectively validated with the IS and FID evaluation metrics. In addition, experiments on a human-face dataset and a cat dataset, comparing against MUNIT, UNIT, and CycleGAN, verify that the self-attention structure can also control details under large geometric changes in both settings and establish context-dependent long-range associations.
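The hybrid-domain self-attention described above can be sketched roughly as follows. This is a minimal NumPy illustration of the two branches (spatial: attention over the H×W positions; channel: attention over the C feature maps) combined residually; the thesis's actual modules sit inside a GAN generator/discriminator and use learned projections and learnable residual weights, and all names here are illustrative assumptions, not the author's code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_self_attention(feat, gamma=1.0):
    # feat: (C, H, W). Attend over the H*W spatial positions so each
    # position is re-weighted by its similarity to every other position.
    C, H, W = feat.shape
    x = feat.reshape(C, H * W)                # (C, N)
    attn = softmax(x.T @ x, axis=-1)          # (N, N) position-to-position weights
    out = x @ attn.T                          # aggregate features across positions
    return feat + gamma * out.reshape(C, H, W)

def channel_self_attention(feat, gamma=1.0):
    # feat: (C, H, W). Attend over the C channels so each feature map
    # is re-weighted by its similarity to every other feature map.
    C, H, W = feat.shape
    x = feat.reshape(C, H * W)                # (C, N)
    attn = softmax(x @ x.T, axis=-1)          # (C, C) channel-to-channel weights
    out = attn @ x                            # aggregate features across channels
    return feat + gamma * out.reshape(C, H, W)

def hybrid_self_attention(feat, gamma_s=1.0, gamma_c=1.0):
    # Hybrid-domain block: sum the two residual branches. Expanding the
    # terms gives feat + gamma_s * spatial_out + gamma_c * channel_out.
    return (spatial_self_attention(feat, gamma_s)
            + channel_self_attention(feat, gamma_c)
            - feat)
```

With both residual weights set to zero the block reduces to the identity, which mirrors the common practice (e.g. in SAGAN-style modules) of initializing the attention contribution at zero and letting training grow it.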
Keywords/Search Tags:Image-to-Image Translation, Unsupervised Learning, Generative Adversarial Network, Self-attention Mechanism