Research and Applications of Text-to-Image Synthesis Based on Generative Adversarial Networks

Posted on: 2020-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Wu
Full Text: PDF
GTID: 2428330578472102
Subject: Computer technology
Abstract/Summary:
Text-to-image synthesis is a popular research topic that combines natural language processing and computer vision. Its ultimate goal is to generate realistic, richly detailed images from text descriptions. Once fully automatic synthesis systems are available, the technique will have wide applications, including automatic mapping and computer-aided design. At the same time, research on this problem faces many challenges.

Recently, deep learning algorithms have shown impressive results in natural language processing and computer vision. Most image understanding tasks, such as image classification and object detection, are coarse-grained. In comparison, the output distribution of text-to-image synthesis is highly multi-modal. A model must not only identify objects and learn their attributes, but also learn the positional relationships between objects and even the motion of objects. This conditional multi-modality makes the task a natural application for Generative Adversarial Networks (GANs), which we therefore adopt to address these problems.

In this thesis, we describe two representative model structures for text-to-image synthesis in detail. Building on existing research, we propose a GAN-based method for text-to-image synthesis and carry out the corresponding experiments. The main contributions of this thesis are as follows:

1. We propose the SG-Stack algorithm. SG-Stack takes scene graphs obtained from the text descriptions as input to generate images, so that it can learn more about the positional relationships between objects. SG-Stack adopts a stacked structure that divides the complex problem into two stages. Each stage is easier than the original problem, and the final images contain more features.

2. We use the PyQt toolkit to develop a GAN-based text-to-image synthesis system. The system implements the main functions of model training and text-to-image synthesis. It also provides training parameter adjustment and a visual display of the training process, so that users can complete model training by themselves.
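
The abstract does not specify how SG-Stack represents its scene graphs. As a rough illustration only, a scene graph can be modelled as a list of objects (nodes) plus directed (subject, predicate, object) relations (edges). The `SceneGraph` class, its method names, and the example description below are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SceneGraph:
    """Hypothetical scene graph: objects are nodes, relations are directed edges."""
    objects: List[str] = field(default_factory=list)
    # Each relation stores (subject index, predicate, object index).
    relations: List[Tuple[int, str, int]] = field(default_factory=list)

    def add_object(self, name: str) -> int:
        """Add an object node and return its index."""
        self.objects.append(name)
        return len(self.objects) - 1

    def add_relation(self, subj: int, predicate: str, obj: int) -> None:
        """Add a directed edge between two existing object nodes."""
        n = len(self.objects)
        if not (0 <= subj < n and 0 <= obj < n):
            raise IndexError("relation refers to an unknown object")
        self.relations.append((subj, predicate, obj))

    def triples(self) -> List[Tuple[str, str, str]]:
        """Return relations as readable (subject, predicate, object) triples."""
        return [(self.objects[s], p, self.objects[o]) for s, p, o in self.relations]


# Assumed example description: "a sheep standing on grass under the sky".
graph = SceneGraph()
sheep = graph.add_object("sheep")
grass = graph.add_object("grass")
sky = graph.add_object("sky")
graph.add_relation(sheep, "standing on", grass)
graph.add_relation(sky, "above", grass)
```

Storing relations as index triples keeps positional information ("standing on", "above") explicit, which is the kind of structure the abstract says a scene-graph input is meant to expose to the generator.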
Keywords/Search Tags: text to image synthesis, Generative Adversarial Networks, scene graph, stacked structure