Research On Text-to-image Generation Based On Multi-stage And Multi-task Generative Adversarial Network

Posted on:2022-09-30

Degree:Master

Type:Thesis

Country:China

Candidate:Y T Xue

GTID:2518306605465794

Subject:Signal and Information Processing

Abstract/Summary:

Image generation based on deep learning has received much attention in computer vision. Text-to-image generation aims to produce images that match a given text description, and it lies at the intersection of computer vision and natural language processing. Because of the differences between the two modalities, the task faces many difficulties and challenges. On the one hand, the Generative Adversarial Network (GAN), the mainstream generative model for this task, suffers from unstable training, low-quality results, and insufficient diversity. Some researchers guide the generation process step by step through a multi-stage network, which improves training stability, but the cascade structure accumulates errors and makes the generated image depend on the quality of the initial image. On the other hand, text-to-image generation is a weakly supervised task, and the supervision provided by the GAN is biased, so the semantic consistency between the image and the text cannot be guaranteed. To generate high-quality, large-size images, this thesis investigates the following two issues and achieves good results.

(1) To address the accumulated error of the multi-stage GAN and its sensitivity to initialization quality, this thesis simulates forgetting behavior through a dual structure of down-sampling and up-sampling layers to make the network's learning more intelligent. At the same time, it modifies the generation process at test time: a discriminator is introduced to identify poor-quality initial images, which are then regenerated to ensure the quality of the initial image. These two methods are integrated into a Multi-stage Adaptive Generative Adversarial Network. Qualitative and quantitative experiments on the CUB and Oxford datasets show that this model performs better than other models.

(2) To address biased guidance in text-to-image generation, this thesis adopts a multi-task learning strategy and adds an auxiliary task to the discriminator, which makes the guidance more refined and effectively improves the semantic consistency between the image and the corresponding text. Experiments show that the Multi-stage and Multiguidance Adaptive Generative Adversarial Network designed in this thesis performs better on the CUB and Oxford datasets than a network trained with a single-task strategy.
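
The test-time regeneration idea in (1) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `generate_with_retry`, the score threshold, the retry limit, and the toy generator and scorer are all assumptions made for illustration. The sketch only shows the general pattern of resampling an initial image until a discriminator-style score is acceptable, and falling back to the best candidate seen if none passes.

```python
import random
from typing import Callable, List, Optional, Tuple

Image = List[float]  # hypothetical stand-in for an image tensor


def generate_with_retry(
    generate: Callable[[random.Random], Image],
    score: Callable[[Image], float],
    threshold: float,
    max_tries: int,
    rng: random.Random,
) -> Tuple[Image, float, int]:
    """Resample the initial image until its score reaches `threshold`.

    `generate` plays the role of the first-stage generator and `score`
    the role of a discriminator (both are assumptions, not the thesis's
    networks). Returns (best image, its score, number of attempts used).
    """
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")

    best_img: Optional[Image] = None
    best_score = float("-inf")
    attempts = 0
    for attempts in range(1, max_tries + 1):
        candidate = generate(rng)      # draw a new initial image
        s = score(candidate)           # discriminator-style quality score
        if s > best_score:
            best_img, best_score = candidate, s
        if s >= threshold:             # good enough: stop regenerating
            break

    assert best_img is not None
    return best_img, best_score, attempts


def toy_generate(rng: random.Random) -> Image:
    # Toy "generator": four random pixel values in [0, 1).
    return [rng.random() for _ in range(4)]


def toy_score(img: Image) -> float:
    # Toy "discriminator": mean pixel value as a quality proxy.
    return sum(img) / len(img)


if __name__ == "__main__":
    image, quality, used = generate_with_retry(
        toy_generate, toy_score, threshold=0.6, max_tries=10,
        rng=random.Random(0),
    )
    print(f"kept image with score {quality:.3f} after {used} attempt(s)")
```

In a real multi-stage network, the accepted initial image would then be passed to the later refinement stages; how the thesis sets its acceptance criterion is not stated in the abstract.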

Keywords/Search Tags:

Generative Adversarial Network, Multi-task Learning, Text-to-Image Generation, Multi-stage, Cross Domain

Related items

1	Research On Text Description Image Generation Based On Generative Adversarial Network
2	Research Of The Cross-domain Image Understanding Based On Generative Adversarial Neural Networks
3	Research And Applications Of Generative Adversarial Network Based On Complete Representation Learning For Multi-view Face Image Generation
4	Learning Based Compression Artifact Removal And Face Image Generation With Generative Adversarial Networks
5	Research And Application On Particular Scene Generation Based On Generative Adversarial Network
6	Research On Text-to-Image Generation Based On Generative Adversarial Network
7	Multi-task Image Classification Method Based On Sample Generation
8	Research On Chinese Text Generation Based On Generative Adversarial Networks
9	Research On Text Generation Based On Generative Adversarial Network
10	Conditional Image Generation Method Based On Generative Adversarial Network