Research On Image-to-Image Translation

Posted on: 2021-03-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J X Lin
GTID: 1368330602494260
Subject: Information and Communication Engineering
Abstract/Summary:
In computer vision, the vigorous development of deep learning in recent years and the accumulation of large-scale visual data have driven rapid progress in visual understanding methods. Visual understanding analyzes the content of videos and images, with the goal of acquiring knowledge from real-world visual data. Computer vision also has a research direction with the opposite goal: visual generation, which aims to convert abstract representations into visual data. This dissertation focuses on image-to-image translation, a visual generation task that aims to learn a mapping that transforms an image from a source image domain to a target image domain. Image-to-image translation is a new and influential research topic that encompasses many problems in computer vision, computer graphics, and image processing. Although it brings many opportunities to industry and academia, existing methods still leave many challenging problems unsolved, including interpretability, generalization, and training stability. This dissertation carries out a series of studies on the basic principles of image-to-image translation and the problems it faces in practical applications, as follows:

(1) Image-to-image translation based on conditional input. A conditional image-to-image translation problem and a conditional dual generative adversarial network algorithm are proposed. Based on an analysis of how image features are transformed during image-to-image translation, generative adversarial networks and dual learning are used to solve conditional image-to-image translation. Experimental results show that the proposed model can effectively translate images using conditional information and is robust across various tasks.

(2) Image-to-image translation based on latent space disentanglement. A domain-supervised generative adversarial network is proposed, which gives the image translation model an interpretable ability to disentangle its latent space. Observing that existing methods make little effective use of domain supervision information, a domain-specific feature extractor pre-trained with domain supervision is proposed. By disentangling two latent features, namely domain-independent features and domain-specific features, a conditional domain-supervised generative adversarial network is further designed for conditional image-to-image translation. Experimental results show that this method disentangles the two features better and achieves state-of-the-art results on various image translation tasks.

(3) Image-to-image translation based on multi-path consistency. A new multi-path consistency loss function is proposed. The analysis shows that, in existing image translation models, the image generated by a one-hop translation and the image generated by a two-hop translation are not consistent. The proposed loss minimizes the difference between the direct translation from the source domain to the target domain and the indirect translation from the source domain through an auxiliary domain to the target domain. This method effectively regularizes the training of image translation tasks, reduces errors in the translation results, and improves the quality of the translated images. It achieves better performance than existing multi-domain and two-domain image translation models on different datasets.

(4) Few-shot image-to-image translation based on meta-learning. A meta-translation generative adversarial network is proposed. Observing that existing image translation methods cannot retain historical learning experience, the problem of unsupervised image translation is studied from the perspective of meta-learning, so that the model can effectively use the experience gained in previous image translation tasks. The model includes a meta generator that retains previous translation experience and a meta discriminator that teaches the meta generator how to generalize quickly to new tasks. Experimental results show that the proposed meta-learning method generally performs better than traditional image translation models and converges faster.

(5) One-shot image-to-image translation based on a multi-scale structure. Because current image translation methods usually require a large amount of image data, an image translation method for two unpaired images is proposed. This method uses two pyramids of generators and discriminators to gradually refine the generated results from global structure to local details. Experimental results show that the proposed method effectively solves the problem of one-shot image translation.

(6) Zero-shot image-to-image translation based on adversarial training. A zero-shot image-to-image translation problem is proposed, which aims to achieve image translation in unseen image domains. Observing that different image domains share associated information, a domain-specific feature distribution constrained by semantic consistency is used to model each seen or unseen domain. A visual semantic encoder and an attribute semantic encoder are employed so that semantic information is shared between the two modalities. Experimental results show that the proposed model can effectively solve the zero-shot image-to-image translation task.
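
The multi-path consistency idea in item (3) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract only states that the difference between the direct and the indirect translation is minimized, so the mean absolute (L1) distance, the flattened-list image representation, and every function name below are assumptions made for illustration.

```python
from typing import Callable, Sequence

# Assumption: an image is a flat sequence of pixel intensities.
Image = Sequence[float]
Translator = Callable[[Image], Image]


def l1_distance(a: Image, b: Image) -> float:
    """Mean absolute difference between two equally sized images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def multi_path_consistency_loss(
    image: Image,
    source_to_target: Translator,
    source_to_aux: Translator,
    aux_to_target: Translator,
) -> float:
    """Distance between the one-hop and two-hop translations of `image`.

    The L1 distance is an assumed choice; the abstract does not name one.
    """
    one_hop = source_to_target(image)
    two_hop = aux_to_target(source_to_aux(image))
    return l1_distance(one_hop, two_hop)


# Hypothetical toy translators: each "domain shift" adds a constant offset.
def to_target(img: Image) -> Image:
    return [p + 0.5 for p in img]


def to_aux(img: Image) -> Image:
    return [p + 0.25 for p in img]


def aux_to_target_consistent(img: Image) -> Image:
    return [p + 0.25 for p in img]


def aux_to_target_drifting(img: Image) -> Image:
    return [p + 0.5 for p in img]


image = [0.0, 0.25, 0.5, 1.0]
consistent = multi_path_consistency_loss(
    image, to_target, to_aux, aux_to_target_consistent
)
drifting = multi_path_consistency_loss(
    image, to_target, to_aux, aux_to_target_drifting
)
print(consistent, drifting)
```

In this toy setup the loss is zero when the two-hop path lands on the same image as the one-hop path and grows when the second hop drifts. In a real model the translators would be generator networks and this term would be added to the adversarial objective as a regularizer.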
Keywords/Search Tags:Generative Adversarial Network, Image-to-Image Translation, Feature Disentanglement, Meta-Learning, Few-Shot Learning