Research and Implementation of High-Resolution Image Generation Based on Text Semantics

Posted on: 2021-02-04 | Degree: Master | Type: Thesis
Country: China | Candidate: Y L Cai
GTID: 2428330632962922 | Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of Internet technology, the information stored on the Internet is becoming increasingly abundant. People can access massive amounts of visual information every day, but it is difficult to find the images they really need through search engines. A text-to-image synthesis system can automatically generate images that match the text description entered by the user, and in this way can better serve personalized needs. The generative adversarial network (GAN), introduced in 2014, opened a new chapter in image generation, and text-to-image synthesis has since attracted growing attention from researchers. Its core challenge is to generate realistic, diverse, and semantically consistent images. Although the technology has improved continuously in recent years, the results are still not satisfactory, and much room for improvement remains. To address the challenges and problems in text-to-image synthesis, the main work of this thesis is summarized as follows:

(1) To address the blurriness, distorted global structure, and inconsistent details in images generated by existing algorithms, this thesis proposes a novel dual attentional generative adversarial network for generating high-resolution images from text. The algorithm consists of two core network modules. The first is a visual feature reconstruction network based on the dual attention model, which enhances local details and global structure by focusing on the features of related words and different visual areas. The second is a sub-pixel feature reconstruction network based on the inverted residual structure, which boosts non-linear representation power by expanding the capacity of the visual features in the middle layer of the residual block. In addition, spectral normalization is introduced into both the generator and the discriminator to stabilize GAN training. Experimental results on multiple datasets show that the algorithm can generate more diverse and more realistic samples.

(2) A single global discriminator can cause overall image distortion because it overemphasizes certain biased local features. Therefore, this thesis proposes a new algorithm for generating high-quality images based on image quality perception, which improves the quality of generated images through self-supervised learning. As the auxiliary task, a ranker based on image quality perception is proposed. The ranker adopts a siamese network to rank autonomously generated images with varying degrees of distortion according to their image quality. While training the GAN for text-to-image synthesis, a new perceptual ranking loss is introduced to optimize the quality of the images produced by the generator. Experimental results on multiple datasets show that the algorithm can generate higher-quality and more realistic samples.

(3) A high-resolution image generation system based on text semantics is designed and implemented, which bridges the "semantic gap" and achieves personalized image customization. The system is implemented with the B/S (browser/server) architecture and provides complete personalized image customization, rich visualization, and well-rounded interaction.
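The abstract does not give the formula for the perceptual ranking loss in item (2). As a rough illustration only, the sketch below uses a common hinge-style pairwise ranking loss, which may differ from the thesis's actual loss. The function names, the linear scorer, its weights, and the feature vectors are all hypothetical stand-ins; in the thesis the scorer would be a trained siamese network operating on images.

```python
from typing import Callable, Sequence

Features = Sequence[float]  # hypothetical stand-in for an image representation

# Assumed fixed weights standing in for a trained quality-scoring network.
WEIGHTS = [0.5, -0.25, 1.0]


def toy_quality_score(features: Features) -> float:
    """Hypothetical scorer: a linear function in place of a learned network."""
    return sum(w * x for w, x in zip(WEIGHTS, features))


def pairwise_ranking_loss(
    score: Callable[[Features], float],
    better: Features,
    worse: Features,
    margin: float = 1.0,
) -> float:
    """Hinge loss: zero once `better` outscores `worse` by at least `margin`."""
    # Siamese setup: the same scorer (shared weights) is applied to both inputs.
    return max(0.0, margin - (score(better) - score(worse)))


# Two made-up inputs: one assumed less distorted, one assumed more distorted.
less_distorted = [1.0, 0.0, 1.0]
more_distorted = [0.0, 2.0, 0.5]

loss_correct_order = pairwise_ranking_loss(toy_quality_score, less_distorted, more_distorted)
loss_wrong_order = pairwise_ranking_loss(toy_quality_score, more_distorted, less_distorted)
print(loss_correct_order, loss_wrong_order)
```

Because the distortion level of each synthetic pair is known in advance, the ordering label comes for free, which is what makes this kind of ranking task self-supervised.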
Keywords/Search Tags: text-to-image synthesis, generative adversarial network, attention, self-supervised learning