Study On Image Generation Based On Attention Mechanism

Posted on: 2021-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
Full Text: PDF
GTID: 2428330611488428
Subject: Control engineering
Abstract/Summary:
The development of deep learning has greatly advanced research on image generation. In conventional deep-learning-based image generation, the local connectivity of convolutional neural networks (CNNs) lets a model learn local features and reproduce the texture and style information extracted by shallow layers, but the model learns the high-level semantic features extracted by deep layers poorly, so semantic objects in the generated images come out blurry and distorted. To improve the network's handling of global features and make semantic objects in the generated images clearer and more realistic, this thesis introduces attention mechanisms into the cascaded refinement network image generation model, strengthens the global consistency among the multi-dimensional features in the network, and improves the quality of images generated from semantic labels and from complex text descriptions. The main research contents and results are as follows:

(1) A self-attention mechanism is introduced into the cascaded refinement network to fuse the multi-dimensional feature maps output by the first refinement module. The resulting self-attention features carry global information, which overcomes the restriction to local features caused by the local connectivity of CNNs. The resulting self-attention cascaded refinement network improves the clarity and realism of semantic objects in images generated from semantic labels. When images generated from the semantic labels of the Cityscapes validation set are semantically segmented, the pixel accuracy for the self-attention model is 6.2% higher than for the original model, and the mean intersection over union (mIoU) is 22.3% higher.

(2) Building on the self-attention mechanism and on the input characteristics of the cascaded refinement network, the multi-dimensional feature maps output by the first refinement module are fused with the multi-dimensional semantic features in the semantic labels. This yields the proposed supervised attention mechanism and the supervised attention cascaded refinement network. The semantic layout and semantic structure guide the output features of the shallow network during generation, which further improves the quality of images generated from semantic labels. For images generated from the semantic labels of the Cityscapes validation set, the pixel accuracy of semantic segmentation is 2.4% higher than for the self-attention model, and the mIoU is 4.4% higher.

(3) Switchable normalization, which is robust across multiple kinds of features, is combined with the instance segmentation model Mask Scoring R-CNN to improve its instance segmentation accuracy. Evaluating the generated images by instance segmentation overcomes the tendency of semantic segmentation to misjudge blurry groups of semantic objects. The quantitative results show that both attention models improve the quality of the generated images.

(4) The supervised attention cascaded refinement network is introduced into sg2im as the generator, which improves the quality of the images the model generates from complex text descriptions represented as scene graphs. Compared with the original sg2im, the Inception score of images generated from the COCO dataset improves by 19.4%, and the Inception score of images generated from the Visual Genome dataset improves by 19.7.

In summary, this thesis introduces self-attention and supervised attention mechanisms into the cascaded refinement network image generation model to enhance the global consistency of the multi-dimensional features output by the network and to overcome the blurry and distorted semantic objects that the local connectivity of CNNs causes in generated images. Experiments on generating images from semantic labels and from complex text descriptions show, both qualitatively and quantitatively, that the attention mechanisms improve the quality of image generation.
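
Item (1) describes self-attention over feature maps only in words. As a rough illustration, the NumPy sketch below shows one common formulation in which every spatial position attends to every other position. The abstract does not specify the layer design used in the thesis, so the function name self_attention_2d, the projection matrices w_query, w_key, and w_value (standing in for 1x1 convolutions), the residual weight gamma, and the tensor sizes are all assumptions made for illustration.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention_2d(features, w_query, w_key, w_value, gamma=1.0):
    """Self-attention across all spatial positions of one feature map.

    features: array of shape (C, H, W)
    w_query, w_key: arrays of shape (D, C), hypothetical 1x1 projections
    w_value: array of shape (C, C), hypothetical 1x1 projection
    gamma: scalar weight on the attention branch (an assumption)
    Returns (output of shape (C, H, W), attention of shape (N, N)),
    where N = H * W.
    """
    channels, height, width = features.shape
    flat = features.reshape(channels, height * width)  # (C, N)

    query = w_query @ flat  # (D, N)
    key = w_key @ flat      # (D, N)
    value = w_value @ flat  # (C, N)

    # attention[i, j] is how strongly position i attends to position j.
    attention = softmax(query.T @ key, axis=-1)  # (N, N), rows sum to 1

    # Each output position is a weighted sum of values from all positions,
    # which is what gives the result its global receptive field.
    attended = value @ attention.T  # (C, N)

    # The residual connection keeps the original local features.
    output = gamma * attended.reshape(channels, height, width) + features
    return output, attention


# Toy input: 8 channels on a 4x4 grid, with random projections.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
w_q = rng.standard_normal((2, 8))
w_k = rng.standard_normal((2, 8))
w_v = rng.standard_normal((8, 8))
out, attn = self_attention_2d(x, w_q, w_k, w_v, gamma=0.5)
print(out.shape, attn.shape)
```

Because the attention matrix has one row per spatial position, its cost grows with the square of the number of positions, which may be one reason to apply attention at a low-resolution stage such as the first refinement module. The supervised attention in item (2) would presumably draw part of its input from the semantic label features rather than from the feature map alone, but the abstract gives no details, so it is not sketched here.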
Keywords/Search Tags: attention mechanism, image generation, cascaded refinement network, semantic label, scene graphs