
Research On Text To Image Synthesis Algorithm Based On Stacked Generative Adversarial Networks

Posted on: 2022-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: X Wang
Full Text: PDF
GTID: 2518306530972269
Subject: Applied Mathematics
Abstract/Summary:
In recent years, text-to-image generation has become an active research topic at the intersection of computer vision and natural language processing. The task takes a descriptive sentence as input and outputs an image that matches the content of the text. With the emergence of generative adversarial networks (GANs) in deep learning, text-to-image generation has developed rapidly. However, GAN training is often unstable: vanishing gradients and mode collapse can occur during training, and the generated results may be inconsistent with the text semantics or lack diversity. Building on previous research, this thesis therefore proposes two text-to-image algorithms based on stacked generative adversarial networks. They improve both the stability of network training and the clarity of the generated images, so that the images look more realistic and natural. The main work is summarized as follows:

(1) A text-to-image network model combined with spectral normalization (SN-StackGAN) is proposed. The model applies spectral normalization to the discriminator networks of both stages. Constraining the Lipschitz constant of each discriminator layer slows the convergence of the discriminator relative to the generator, which makes network training more stable. The model is verified experimentally on the Oxford Flowers dataset and the MS COCO dataset, with the Inception score used to evaluate the quality of the generated images. The experimental results show that, compared with StackGAN, SN-StackGAN trains more stably and produces images of better quality.

(2) A text-to-image network model combined with a perceptual loss function is proposed. To improve the clarity of the generated images, a perceptual loss function is added to the generator networks of both stages of SN-StackGAN, which strengthens the consistency between the text content and the generated image. The model is verified on the Oxford Flowers dataset and the Birds dataset, with the Inception score used to evaluate the quality of the generated images. The experimental results show that, compared with SN-StackGAN, the images generated by this model are more realistic.
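As an illustration of the spectral normalization idea in point (1), the following is a minimal NumPy sketch, not the thesis implementation. It assumes the common formulation in which each discriminator weight matrix is divided by an estimate of its largest singular value, obtained by power iteration, so that the corresponding linear layer is approximately 1-Lipschitz. The function name `spectral_normalize` and its parameters `n_iters`, `eps`, and `seed` are hypothetical and chosen only for this example.

```python
import numpy as np


def spectral_normalize(weight, n_iters=50, eps=1e-12, seed=0):
    """Return (weight / sigma, sigma), where sigma estimates the largest
    singular value of the weight, viewed as a 2-D matrix.

    Hypothetical helper for illustration; assumes power iteration is used
    to estimate the spectral norm.
    """
    # Flatten convolution kernels (out, in, kh, kw) to a matrix (out, in*kh*kw).
    w = weight.reshape(weight.shape[0], -1)

    # Random starting vector for the power iteration.
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(w.shape[0])
    u /= np.linalg.norm(u) + eps

    for _ in range(n_iters):
        # Alternate between right and left singular vector estimates.
        v = w.T @ u
        v /= np.linalg.norm(v) + eps
        u = w @ v
        u /= np.linalg.norm(u) + eps

    # Estimate of the largest singular value (spectral norm).
    sigma = u @ w @ v
    return weight / sigma, sigma


if __name__ == "__main__":
    # Example: a fake 4-D convolution kernel of a discriminator layer.
    kernel = np.random.default_rng(1).standard_normal((8, 4, 3, 3))
    kernel_sn, sigma = spectral_normalize(kernel)
    print("estimated spectral norm:", sigma)
    print("shape preserved:", kernel_sn.shape == kernel.shape)
```

In practice, training code typically runs only one power-iteration step per update and keeps the vector `u` between updates; the larger `n_iters` here is used only so the standalone example converges without stored state.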
Keywords/Search Tags: Text-to-image generation, generative adversarial network, spectral normalization, perceptual loss function