
Text-to-image Algorithm Based On Generative Adversarial Network

Posted on: 2024-09-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y R Duan
Full Text: PDF
GTID: 2568307157472734
Subject: Information and Communication Engineering
Abstract/Summary:
The text-to-image task converts descriptive text into visual pictures. Generative Adversarial Networks (GANs) can realize this transformation, but because the task is cross-modal and complex, current GAN-based text-to-image algorithms still suffer from three problems: regions unrelated to the text are generated with low quality, fine details are generated poorly, and the mapping relationship between text and image is missing in the initialization stage. To address these shortcomings of existing models, the generator and discriminator networks are reconstructed separately. The details of the study are as follows:

(1) To improve the quality of the generated images and strengthen the network's generation of detail, a multi-level affine-combination text-to-image network (AF-GAN) is proposed on the basis of the existing multi-stage network. For the former, a text-image affine combination module is added to improve the fine-grained properties of the generated images and strengthen the cross-modal connection between text and image; bias terms record the features of text-independent regions so that even those regions are generated with high quality in the final output. For the latter, a detail correction module is added to the network: word-level features are combined with image information, and a spatial and channel attention mechanism focuses on the main feature information to further enhance details in the synthesized image. An affine module is also added here to refine missing content in the generated image.

(2) To solve the lack of a mapping relationship between text and image in the initial stage, and the lack of detailed feedback from the discriminator to the generator, an adaptive multi-cascade text-to-image network (AM-GAN) is proposed, with three main improvements. First, a cross-attention encoding structure is added in the initial image generation stage: the text information is input to this encoder together with the image, and it outputs cross-attention features aligned with the image features, reflecting the text-image mapping relationship and improving the quality of the generated images. Second, instance normalization is used as the normalization method to improve the stability of the trained model. Third, the discriminator network uses an adaptive discriminator that returns its results to the generator, allowing the generator to capture information about different image regions and thus perform more detailed generation.

For the above models, extensive experiments were conducted on the CUB and COCO datasets to evaluate the quality of the generated samples and compare the metric values. On the CUB dataset, AF-GAN improves the IS metric by 0.58 and reduces FID by 5.69 compared with the previous best results; on the CUB and COCO datasets, AM-GAN raises IS to 5.51 and 32.51 and lowers FID to 10.21 and 30.06, respectively. These results demonstrate the effectiveness of the proposed algorithms. The details of the images generated on each dataset are then analyzed to further illustrate the feasibility and superiority of the algorithms.
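To make the two core operations concrete, the following is a minimal numpy sketch of (a) text-conditioned affine modulation, in the spirit of the affine combination module (scale and shift image features with parameters predicted from a text embedding), and (b) scaled dot-product cross-attention between image regions and word features, as used in the cross-attention encoder. This is an illustrative sketch only, not the thesis's actual implementation; all function names, variable names, and dimensions here are hypothetical, and the projection matrices would be learned in a real network.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def affine_combination(img_feat, text_emb, W_gamma, W_beta):
    """Text-conditioned affine modulation: out = gamma * x + beta.
    img_feat: (C, H, W) image feature map
    text_emb: (D,) sentence embedding
    W_gamma, W_beta: (C, D) projections (learned in practice)
    """
    gamma = W_gamma @ text_emb                     # (C,) per-channel scale
    beta = W_beta @ text_emb                       # (C,) per-channel shift
    return gamma[:, None, None] * img_feat + beta[:, None, None]

def cross_attention(region_feats, word_feats, Wq, Wk, Wv):
    """Image regions attend to words (queries from image, keys/values from text).
    region_feats: (N, D) flattened image regions; word_feats: (T, D) word embeddings.
    Returns (N, D) text-aligned features for each image region.
    """
    Q = region_feats @ Wq                          # (N, D)
    K = word_feats @ Wk                            # (T, D)
    V = word_feats @ Wv                            # (T, D)
    attn = softmax(Q @ K.T / np.sqrt(K.shape[-1])) # (N, T) region-to-word weights
    return attn @ V

# Toy usage with random features
rng = np.random.default_rng(0)
C, D, H, W = 4, 8, 2, 2
out = affine_combination(rng.standard_normal((C, H, W)),
                         rng.standard_normal(D),
                         rng.standard_normal((C, D)),
                         rng.standard_normal((C, D)))
print(out.shape)  # (4, 2, 2)

ctx = cross_attention(rng.standard_normal((6, D)), rng.standard_normal((5, D)),
                      np.eye(D), np.eye(D), np.eye(D))
print(ctx.shape)  # (6, 8)
```

In a full model these operations run at every resolution level of the multi-stage generator; the sketch shows only the per-layer arithmetic.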
Keywords/Search Tags: Text to Image, Generative Adversarial Network, Affine Combination Module, Detail Correction, Cross Attention