Research On Generation Of Bilingual Image Captions

Posted on: 2021-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: K Zhang
Full Text: PDF
GTID: 2428330605974901
Subject: Software engineering
Abstract/Summary:
End-to-end neural network-based approaches have become the mainstream in image caption generation. Most related studies focus on generating captions in a single language (e.g., English) and have achieved good performance. However, in many scenarios an image needs to be described in different languages so that people with different native languages can understand the same image. It is therefore necessary to generate captions in two or more languages for an image. This thesis focuses on image captioning in two languages via a pivot language, joint modeling of the features of two languages, and the combination of self-attention and recurrent networks. The main contents include:

(1) Image captioning based on a pivot language. In the scenario where no Chinese image captions are available, this thesis explores zero-resource image captioning, generating Chinese captions with English as the pivot language. Specifically, it proposes two approaches that take advantage of recent advances in neural machine translation. The first, called the pipeline approach, generates an English caption for a given image and then translates that caption into Chinese. The second, called the pseudo-training-set approach, first translates all English captions in the training and development sets into Chinese to obtain an image-Chinese caption dataset, on which a model can then be trained directly to generate Chinese captions for a given image.

(2) Joint generation of bilingual image captions. In the scenario where bilingual image caption corpora exist, and motivated by the fact that the two captions of an image are semantically equivalent, this thesis proposes a joint model to generate bilingual image captions. Specifically, the two decoders generate captions in an alternating way, so that the decoding history of both languages is available when predicting the next word.

(3) Research on generation of bilingual image captions via joint self-attention and recurrent neural networks. When generating bilingual image captions, the decoder uses a recurrent neural network to model the interaction between the image features and the caption, but it ignores the self-attention that captures interactions within the image modality or within the caption modality. This thesis presents a model that combines the advantages of recurrent networks and self-attention networks to generate image captions. Reinforcement learning with policy-gradient optimization is then applied to further improve performance.

Experimental results on a publicly available image caption dataset show that the approaches proposed in this thesis significantly improve image captioning performance.
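
The control flow of the two pivot-language approaches and of the alternating bilingual decoding described above can be sketched roughly as follows. This is only an illustrative sketch, not the thesis's implementation: every name here (`pipeline_caption`, `build_pseudo_training_set`, `alternate_decode`, and the toy stand-ins for the captioner, translator, and decoder steps) is hypothetical, the real components would be trained neural models, and the choice to let the English decoder step first in each round is an assumption.

```python
from typing import Callable, List, Tuple

# Hypothetical type aliases; the thesis's models are neural networks, these are stand-ins.
Image = str                                        # an image identifier
Captioner = Callable[[Image], str]                 # image -> English caption
Translator = Callable[[str], str]                  # English text -> Chinese text
Step = Callable[[List[str], List[str]], str]       # (EN history, ZH history) -> next word


def pipeline_caption(image: Image, en_captioner: Captioner, en2zh: Translator) -> str:
    """Pipeline approach: caption the image in English, then translate into Chinese."""
    return en2zh(en_captioner(image))


def build_pseudo_training_set(
    en_pairs: List[Tuple[Image, str]], en2zh: Translator
) -> List[Tuple[Image, str]]:
    """Pseudo-training-set approach: translate each English caption once, so that a
    Chinese captioning model can later be trained directly on (image, Chinese) pairs."""
    return [(image, en2zh(caption)) for image, caption in en_pairs]


def alternate_decode(
    step_en: Step, step_zh: Step, max_len: int, eos: str = "</s>"
) -> Tuple[List[str], List[str]]:
    """Alternating decoding: the two decoders take turns emitting one word each,
    and every step receives the decoding history of both languages."""
    hist_en: List[str] = []
    hist_zh: List[str] = []
    for _ in range(max_len):
        done_en = bool(hist_en) and hist_en[-1] == eos
        done_zh = bool(hist_zh) and hist_zh[-1] == eos
        if done_en and done_zh:
            break
        if not done_en:
            hist_en.append(step_en(hist_en, hist_zh))   # assumption: English goes first
        if not done_zh:
            hist_zh.append(step_zh(hist_en, hist_zh))
    return hist_en, hist_zh


# Toy stand-ins (hypothetical) so the sketch runs without any trained model.
TOY_DICT = {"a dog runs": "一只狗在跑"}
toy_captioner: Captioner = lambda image: "a dog runs"
toy_en2zh: Translator = lambda text: TOY_DICT.get(text, text)

EN_SCRIPT = ["a", "dog", "</s>"]
WORD_MAP = {"a": "一只", "dog": "狗", "</s>": "</s>"}
toy_step_en: Step = lambda he, hz: EN_SCRIPT[len(he)]
toy_step_zh: Step = lambda he, hz: WORD_MAP[he[-1]]   # reads the English history

zh_caption = pipeline_caption("img1.jpg", toy_captioner, toy_en2zh)
pseudo_set = build_pseudo_training_set([("img1.jpg", "a dog runs")], toy_en2zh)
en_words, zh_words = alternate_decode(toy_step_en, toy_step_zh, max_len=10)
```

The toy Chinese step simply looks up the latest English word, which is only meant to show that each decoder can condition on the other language's history. In the pipeline approach translation happens at inference time for every image, whereas the pseudo-training-set approach translates once up front and then needs no translator at inference time.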
Keywords/Search Tags: bilingual image caption, pivot language, neural network, reinforcement learning