Research And Application Of Intelligent Image Caption

Posted on: 2020-08-07  Degree: Master  Type: Thesis
Country: China  Candidate: L T He  Full Text: PDF
GTID: 2428330596975094  Subject: Information security
Abstract/Summary:
Image captioning is a technology that understands the content of a picture and describes it in human-like language. It studies the locations of objects in an image and the relationships among them. Image captioning has broad prospects in intelligent image search, visual scene description, and other fields. To improve the accuracy and diversity of image captions, this thesis proposes a network called C-CGAN (Comparative-Conditional Generative Adversarial Networks). C-CGAN extends CGAN with the concept of “comparison”, which is embodied in two aspects. First, when scoring the relevance between an image and its descriptions, the discriminator compares the sentence being assessed with all of the descriptions, which include machine-generated descriptions, human descriptions, and unrelated descriptions. Second, during back-propagation of the discriminator, the loss value of the unrelated descriptions is added to the loss function. Constraining the loss on unrelated descriptions further improves the correlation between the descriptions and the corresponding image, that is, the accuracy of the machine-generated sentences. Following the training flow of C-CGAN, we analyze the requirements and implement the network, which comprises model training, validation-set testing, and image caption generation. During model training, we use policy gradient for back-propagation and Monte Carlo rollout to obtain immediate feedback. C-CGAN is applied to the standard datasets MSCOCO and Flickr30K and to the self-made dataset PANDA to obtain image caption results. A CGAN model serves as the baseline and is compared with C-CGAN. The results show that C-CGAN performs well: it improves the diversity of image captions while maintaining their accuracy on all three datasets. Given the broad prospects of image captioning, increasing diversity could deepen a machine's understanding and widen the range of vocabulary and sentence structure it uses.
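The abstract does not give the formulas behind the two "comparison" aspects or the rollout step, so the following is only a minimal sketch under stated assumptions, not the thesis's implementation. It assumes the comparison can be modeled as a softmax over image-sentence similarity scores, that the discriminator loss is a cross-entropy with an extra weighted term for unrelated descriptions, and that a Monte Carlo rollout averages a reward over randomly completed sentences. All names (`comparative_scores`, `discriminator_loss`, `rollout_reward`, `unrelated_weight`, `sample_next`, `reward_fn`) are hypothetical.

```python
import math
import random


def comparative_scores(similarities):
    """Softmax over raw image-sentence similarity scores (assumed form).

    Each sentence's score depends on every other sentence in the
    comparison set, which is one possible reading of "comparison".
    """
    peak = max(similarities)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in similarities]
    total = sum(exps)
    return [e / total for e in exps]


def discriminator_loss(p_human, p_machine, p_unrelated,
                       unrelated_weight=1.0, eps=1e-12):
    """Cross-entropy style loss with an added unrelated-description term.

    Each p_* is the discriminator's probability that a sentence is a
    genuine, relevant description of the image. The exact form and the
    weight are assumptions made for illustration.
    """
    loss_human = -math.log(p_human + eps)                # should score high
    loss_machine = -math.log(1.0 - p_machine + eps)      # should score low
    loss_unrelated = -math.log(1.0 - p_unrelated + eps)  # should score low
    return loss_human + loss_machine + unrelated_weight * loss_unrelated


def rollout_reward(prefix, max_len, sample_next, reward_fn,
                   n_rollouts=8, seed=0):
    """Estimate the reward of a partial sentence by Monte Carlo rollout.

    The prefix is completed n_rollouts times with sample_next (a stand-in
    for sampling from the generator) and the rewards of the completed
    sentences are averaged.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rollouts):
        sentence = list(prefix)
        while len(sentence) < max_len:
            sentence.append(sample_next(sentence, rng))
        total += reward_fn(sentence)
    return total / n_rollouts
```

In a policy-gradient setup of this kind, the averaged rollout reward would typically weight the log-probability of the word just generated, so the generator receives feedback before the sentence is finished.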
Keywords/Search Tags:Image Caption, Policy Gradient, Monte Carlo Rollout, Comparative-Conditional Generative Adversarial Networks