
Foreground Image-to-Image Domain Translation Based On Self-attention Mechanism

Posted on: 2021-04-28  Degree: Master  Type: Thesis
Country: China  Candidate: G L Wang  Full Text: PDF
GTID: 2428330611951380  Subject: Software engineering
Abstract/Summary:
Unsupervised image translation has been a hot research topic in recent years. It maps images from one domain to another without paired training data, and it is widely used in image stylization, domain-adaptive learning, and other fields. Current unsupervised translation methods find it difficult to modify a single object without also changing the background, or the way multiple objects interact in a scene; separating the foreground from the background is therefore a major challenge.

This thesis builds on MUNIT, an unsupervised image-translation framework that decomposes an image representation into a domain-invariant content code and a style code that captures domain-specific attributes. To address the shortcomings of existing algorithms in foreground image-domain translation, two improvements are proposed. First, exploiting U-net's ability to link low-level features of the input image to the generated image, a U-net foreground-acquisition network is proposed: it separates the foreground from the background, preserves the information of each, and then translates only the foreground. Second, inspired by the attention mechanism of the human brain, whose basic idea in computer vision is to let the model ignore irrelevant information and focus on key information, a hybrid-domain self-attention module is designed by combining spatial-domain and channel-domain self-attention. It is added to both the generator and the discriminator, suppressing the influence of the target domain on the background and controlling image details.

Experiments are conducted on the horse2zebra and apple2orange datasets, with MUNIT, UAGAN, and CycleGAN as baselines. The results show clear improvement in subjective visual quality, and the models are objectively validated with the IS and FID evaluation metrics. In addition, experiments on a human-face dataset and a cat dataset, comparing against MUNIT, UNIT, and CycleGAN, verify that the self-attention structure can also control details under large geometric changes in both settings and establish context-dependent long-range associations.
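The hybrid-domain self-attention described above can be sketched roughly as follows. This is a minimal NumPy illustration of the two branches (spatial: attention over the H×W positions; channel: attention over the C feature maps) combined residually; the thesis's actual modules sit inside a GAN generator/discriminator and use learned projections and learnable residual weights, and all names here are illustrative assumptions, not the author's code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_self_attention(feat, gamma=1.0):
    # feat: (C, H, W). Attend over the H*W spatial positions so each
    # position is re-weighted by its similarity to every other position.
    C, H, W = feat.shape
    x = feat.reshape(C, H * W)                # (C, N)
    attn = softmax(x.T @ x, axis=-1)          # (N, N) position-to-position weights
    out = x @ attn.T                          # aggregate features across positions
    return feat + gamma * out.reshape(C, H, W)

def channel_self_attention(feat, gamma=1.0):
    # feat: (C, H, W). Attend over the C channels so each feature map
    # is re-weighted by its similarity to every other feature map.
    C, H, W = feat.shape
    x = feat.reshape(C, H * W)                # (C, N)
    attn = softmax(x @ x.T, axis=-1)          # (C, C) channel-to-channel weights
    out = attn @ x                            # aggregate features across channels
    return feat + gamma * out.reshape(C, H, W)

def hybrid_self_attention(feat, gamma_s=1.0, gamma_c=1.0):
    # Hybrid-domain block: sum the two residual branches. Expanding the
    # terms gives feat + gamma_s * spatial_out + gamma_c * channel_out.
    return (spatial_self_attention(feat, gamma_s)
            + channel_self_attention(feat, gamma_c)
            - feat)
```

With both residual weights set to zero the block reduces to the identity, which mirrors the common practice (e.g. in SAGAN-style modules) of initializing the attention contribution at zero and letting training grow it.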
Keywords/Search Tags:Image-to-Image Translation, Unsupervised Learning, Generative Adversarial Network, Self-attention Mechanism