Research On Image Captioning By The Method Of Generative Adversarial Networks

Posted on: 2021-02-15 | Degree: Master | Type: Thesis
Country: China | Candidate: H Li
GTID: 2428330623467789 | Subject: Computer Science and Technology
Abstract/Summary:
Image captioning is an important task in computer vision and artificial intelligence. Drawing on both visual and linguistic understanding, it generates descriptive sentences for a given image. The generated sentences should not only describe the image accurately but also read naturally to humans. Traditional models focus only on the accuracy and fidelity of the generated sentences and lack diversity and distinctiveness, so their captions are monotonous. In this work, we use a method based on Generative Adversarial Networks (GAN) to address this problem, which arises from training with Maximum Likelihood Estimation (MLE). Because generation in a GAN is inherently random, our proposed model can produce more diverse and distinctive image captions. Moreover, to preserve the accuracy of the generated captions, we use external text data to train the discriminator in our model. In our method, the external text data are captions with the same meaning written in another language. As a result, the captions generated by our model are both diverse and accurate. Our contributions are as follows:
1. We propose a novel GAN-based model that uses external text data to train the discriminator, so that the generated captions are both diverse and accurate.
2. Our model yields a new evaluation metric that assesses captions more comprehensively than existing metrics.
3. The results of various experiments show that our model consistently outperforms traditional models.
Keywords/Search Tags: Image Captioning, Maximum Likelihood Estimation (MLE), Generative Adversarial Networks (GAN), Reinforcement Learning (RL), Deep Learning