With the development of deep learning, image synthesis has attracted increasing attention, and its range of applications has grown ever wider. This paper studies text-to-image synthesis in depth, that is, learning how to guide the generation of images from a text description. The generated images are required not only to be realistic and diverse, but also to match the given text description in content. Building on a study of existing text-to-image models, this paper proposes the following two models:

1. TMGAN, a text-guided image manipulation model based on a generative adversarial network. The generator adopts a Transformer encoder-decoder structure to extract global context information, which addresses the problem that generated images were not realistic enough. The discriminator consists of two parts, a Transformer-based multi-scale discriminator and a word-level discriminator, which give the generator more fine-grained feedback so that the generated image both satisfies the text description and preserves the content of the original image that is unrelated to the text. Experiments on the public CUB bird dataset show that the IS (Inception Score), FID (Fréchet Inception Distance), and MP (Manipulation Precision) metrics reach 9.07, 8.64, and 0.081, respectively. The proposed method outperforms state-of-the-art methods: the generated image not only meets the attribute requirements of the given text description but also maintains high semantic consistency with it.

2. TBMGAN, a text-guided image manipulation model based on a multi-stage generative adversarial network. It generates high-quality, high-resolution images step by step through two stages: the first stage improves the quality of image generation by incorporating more perceptual information, and the second stage extracts image features adaptively through a dynamic memory module. The model uses a BERT text encoder to process the text, replaces the convolutional network with a Transformer to capture context information, and integrates a word-level discriminator into the discriminator to give the generator finer-grained feedback. Experiments show that the IS, FID, and MP metrics on the CUB bird dataset reach 9.07, 9.15, and 0.085, respectively, while on the COCO dataset they reach 27.88, 15.21, and 0.072, respectively. The proposed method outperforms state-of-the-art methods: it generates high-quality, semantically consistent images that match the input text description not only on simple datasets but also on the complex scenes of the COCO dataset.
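
The thesis abstract does not include code. As a minimal illustrative sketch of the kind of Transformer encoder-decoder generator described for TMGAN, written in PyTorch: the text is encoded into contextual word features, the source image is split into patch tokens, and every patch attends to the words so that text-relevant regions can be edited while the rest is preserved. All module names, dimensions, and the patch-token layout here are assumptions for illustration, not the thesis's actual implementation.

# Hypothetical sketch of a Transformer encoder-decoder generator for
# text-guided image manipulation (TMGAN-style); all dimensions and the
# patch layout are illustrative assumptions.
import torch
import torch.nn as nn

class TransformerManipulationGenerator(nn.Module):
    def __init__(self, vocab_size=5000, d_model=256, patch=16, img_size=256):
        super().__init__()
        self.patch, self.img_size = patch, img_size
        self.n_patches = (img_size // patch) ** 2
        self.word_emb = nn.Embedding(vocab_size, d_model)       # word tokens
        self.patch_emb = nn.Conv2d(3, d_model, patch, patch)    # patchify the source image
        self.pos = nn.Parameter(torch.zeros(1, self.n_patches, d_model))
        enc = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=4)  # global context over words
        dec = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec, num_layers=4)  # image tokens attend to words
        self.to_rgb = nn.Sequential(nn.Linear(d_model, patch * patch * 3), nn.Tanh())

    def forward(self, image, word_ids):
        # Contextual word features from the text description.
        words = self.encoder(self.word_emb(word_ids))                   # (B, T, d)
        # Patch tokens of the source image; cross-attention edits them.
        tokens = self.patch_emb(image).flatten(2).transpose(1, 2) + self.pos  # (B, N, d)
        edited = self.decoder(tokens, words)                            # (B, N, d)
        rgb = self.to_rgb(edited)                                       # per-patch pixels
        B, g, p = image.size(0), self.img_size // self.patch, self.patch
        rgb = rgb.view(B, g, g, p, p, 3).permute(0, 5, 1, 3, 2, 4)
        return rgb.reshape(B, 3, self.img_size, self.img_size)

# Smoke test with random inputs.
gen = TransformerManipulationGenerator()
out = gen(torch.randn(2, 3, 256, 256), torch.randint(0, 5000, (2, 18)))
print(out.shape)  # torch.Size([2, 3, 256, 256])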
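
The word-level discriminator mentioned for both models can likewise be sketched: instead of a single sentence-level real/fake score, each word attends over image region features and receives its own logit, giving the generator per-word feedback on which attributes the image does or does not reflect. The shapes and region-extraction network below are assumptions, not the thesis's architecture.

# Hypothetical sketch of a word-level discriminator: one real/fake logit
# per word, computed from word-to-region attention. Shapes are assumptions.
import torch
import torch.nn as nn

class WordLevelDiscriminator(nn.Module):
    def __init__(self, d_model=256):
        super().__init__()
        # Downsample the image into a grid of region features.
        self.regions = nn.Sequential(
            nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, 2, 1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, d_model, 4, 2, 1),
        )
        self.score = nn.Linear(d_model, 1)

    def forward(self, image, word_feats):
        # word_feats: (B, T, d) contextual word features from the text encoder.
        r = self.regions(image).flatten(2).transpose(1, 2)            # (B, N, d) regions
        attn = torch.softmax(word_feats @ r.transpose(1, 2), dim=-1)  # word-to-region attention
        attended = attn @ r                                           # (B, T, d) evidence per word
        return self.score(attended).squeeze(-1)                       # (B, T) one logit per word

disc = WordLevelDiscriminator()
logits = disc(torch.randn(2, 3, 256, 256), torch.randn(2, 18, 256))
print(logits.shape)  # torch.Size([2, 18])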
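
Finally, the dynamic memory used in TBMGAN's second refinement stage can be sketched in the spirit of DM-GAN-style memory modules: word features are written into a key-value memory, stage-one image features query it, and a gate decides how much of the memory response to inject back into each region. The gating form and all dimensions below are assumptions for illustration.

# Hypothetical sketch of a dynamic-memory refinement step: words are written
# to memory, image regions read from it, and a gate fuses the response.
import torch
import torch.nn as nn

class DynamicMemory(nn.Module):
    def __init__(self, d_img=256, d_word=256, d_mem=256):
        super().__init__()
        self.write_key = nn.Linear(d_word, d_mem)    # memory keys from words
        self.write_val = nn.Linear(d_word, d_mem)    # memory values from words
        self.query = nn.Linear(d_img, d_mem)         # queries from image regions
        self.gate = nn.Linear(d_img + d_mem, d_img)  # how much memory to inject

    def forward(self, img_feats, word_feats):
        # img_feats: (B, N, d_img) region features from the stage-1 image.
        # word_feats: (B, T, d_word) contextual word features.
        k = self.write_key(word_feats)                       # (B, T, d_mem)
        v = self.write_val(word_feats)                       # (B, T, d_mem)
        q = self.query(img_feats)                            # (B, N, d_mem)
        addr = torch.softmax(q @ k.transpose(1, 2), dim=-1)  # (B, N, T) addressing
        response = addr @ v                                  # (B, N, d_mem)
        g = torch.sigmoid(self.gate(torch.cat([img_feats, response], -1)))
        return g * response + (1 - g) * img_feats            # gated refinement

mem = DynamicMemory()
refined = mem(torch.randn(2, 64, 256), torch.randn(2, 18, 256))
print(refined.shape)  # torch.Size([2, 64, 256])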