
Research On Algorithm Of Text To Image Based On Generative Adversarial Network

Posted on: 2022-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Yin
Full Text: PDF
GTID: 2518306569488574
Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet technology, images have become a customary form for expressing and storing information. The number of images is growing rapidly, and their content and forms are becoming increasingly diverse. People mainly obtain images through search engines, but the diversity and complexity of images mean that they often cannot find the pictures they actually need. Generating images from text, so that a computer can produce images on demand, has therefore attracted wide attention. Text-to-image generation is a task at the intersection of computer vision and natural language processing. The goal is to give the computer a text description and have it understand the semantic information in that text well enough to generate a matching image. Generating an image from a given text description must achieve two goals: (1) the image is visually realistic enough; (2) the image content is consistent with the semantics of the corresponding text. Although generative adversarial networks (GANs) have made significant breakthroughs in generating high-resolution, highly realistic images, applying them to text-to-image generation remains very challenging, because the correspondence between text and images is diverse and the training process is unstable.

To improve the quality of the generated images and the stability of training, this thesis does the following work. To improve the quality of images generated by the text-to-image model, it proposes a spatial self-attention generative adversarial network model that uses semantic and spatial information to strengthen global and local attention. In addition to sentence-level text features and a random vector, the generator also takes word-level features as constraints. To make the overall layout of the generated image clearer, a spatial self-attention module is introduced. It retains the spatial information of the semantic annotations, so that the generator pays more attention to the overall layout of the image and regularizes the overall position of objects during generation, which has a positive effect on optimizing the generator. Experiments show that this method successfully improves the quality of the generated images.

To verify the effectiveness of the SEAGAN proposed in this thesis and to make the generative network more applicable in practice, the thesis uses the Glasses-1 dataset, proposed in a cooperation project with Danyang Glasses Company. Glasses-1 contains 3123 images of different types of glasses. After training with the proposed text-to-image model, SEAGAN can generate clear, fine-grained images of glasses from text descriptions. This reduces the project's costs for glasses design and image shooting, and the generated images reach the clarity required for the product. These results demonstrate the effectiveness of SEAGAN in an industrial setting.
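The abstract does not give the internals of the spatial self-attention module, so the following is only a minimal sketch of generic dot-product self-attention over the spatial positions of a feature map, which is the general idea such a module builds on. Everything here is an assumption for illustration: the function names (`spatial_self_attention`, `generator_input`), the `gamma` residual weight, the scaling by the square root of the channel count, and the use of identity projections in place of the learned query, key, and value projections a trained model would have.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def spatial_self_attention(features, gamma=1.0):
    """Generic self-attention over spatial positions (illustrative only).

    features: list of N vectors, one per spatial position of a flattened
              H*W feature map, each with C channels.
    gamma:    hypothetical residual weight; 0.0 returns the input unchanged.

    Assumption: queries, keys, and values are the features themselves
    (identity projections), not learned projections.
    """
    channels = len(features[0])
    scale = math.sqrt(channels)
    output = []
    for query in features:
        # How strongly this position attends to every position in the map.
        weights = softmax([dot(query, key) / scale for key in features])
        # Weighted sum of all positions: a global summary seen from here.
        context = [
            sum(w * value[c] for w, value in zip(weights, features))
            for c in range(channels)
        ]
        # Residual connection keeps the local feature and adds global context.
        output.append([x + gamma * ctx for x, ctx in zip(query, context)])
    return output


def generator_input(sentence_vec, noise_vec):
    """Concatenate a sentence embedding with a random noise vector.

    Assumption: plain concatenation; the abstract only says both are inputs.
    """
    return list(sentence_vec) + list(noise_vec)
```

Because each output position mixes in a weighted summary of every other position, a layer of this kind lets the generator take the whole layout into account when refining a single region. Word-level features, which the abstract says are added as constraints, would enter through a separate mechanism that is not sketched here.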
Keywords/Search Tags:Generative adversarial network, Text to images, Self-attention, Glasses-1