
Research and Application of Image Captioning Methods

Posted on: 2021-02-16 | Degree: Master | Type: Thesis
Country: China | Candidate: C R Long | Full Text: PDF
GTID: 2428330614460387 | Subject: Computer application technology
Abstract/Summary:
Image captioning aims to translate an image into a complete, natural sentence and involves both computer vision and natural language processing. On the one hand, although image captioning has achieved good results thanks to the rapid development of deep neural networks, an excessive pursuit of scores on the standard evaluation metrics makes the generated descriptions too conservative for practical applications; it is necessary to increase the diversity of the generated text and to account for prior knowledge such as a user's preferred vocabulary and writing style. On the other hand, image captioning usually requires a large set of training image-sentence pairs, and acquiring sufficient pairs is expensive in practice, which limits recent captioning models in their ability to describe objects outside the training corpora (i.e., novel words). How to reduce the dependence on paired image-sentence data, learn the domain variance between different datasets, and exploit other available annotations to train a captioning model well therefore becomes increasingly important. To address the problems of personalization, domain variance, and novel words in image captioning, the main contributions of this dissertation are as follows:

(1) This dissertation proposes a personalized image captioning method that generates sentences describing a user's own story and life experience in the word expressions that user prefers. The method flexibly models user interests by embedding user IDs as interest vectors: the information unique to each user, such as image features, the user ID, and the user's content, is used to construct a characteristic interest vector. Combined with a top-down attention mechanism, the interest vector better guides the training of the language model and yields descriptions that conform to the user's style (sketched below). The effectiveness of the method is verified on datasets from the Instagram and Lookbook platforms.

(2) This dissertation proposes simple and effective domain-invariant constraints for learning cross-domain caption generation models that can be applied to different data platforms. By constructing distance-based domain constraints for the model, the domain shift between sentence-level features of the source and target domains is minimized in the hidden space and shared subspace features are learned (sketched below). A domain-shared dictionary is proposed at the same time to enrich sentence generation across data domains. To further exploit the private characteristics of each data domain, the dissertation also uses a domain-classifier mechanism to guide the language model to generate sentences for a specific data domain. Experimental results demonstrate the effectiveness of the method.

(3) This dissertation applies a language model with a copy mechanism to food-analysis datasets. The model can directly "copy" suitable words from the candidate words generated for an image, including novel words that never appear in the paired image-text dataset, into the output sentence, thereby generating descriptions that contain novel words (sketched below). By embedding the copy mechanism in a conventional end-to-end sequence generation model and assisting it with an effective object detection model, the language model learns to generate descriptions of novel words. Experimental results demonstrate the effectiveness of the method.
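The personalized captioning method in (1) conditions the decoder on a user interest vector obtained by embedding the user ID and combines it with attended image features. The following is a minimal, illustrative PyTorch sketch of one such decoding step; the class name UserAwareDecoder, the dimensions, and the simple additive attention are assumptions made for illustration, not details taken from the dissertation.

```python
import torch
import torch.nn as nn

class UserAwareDecoder(nn.Module):
    """One-step caption decoder conditioned on a user interest vector.

    Hypothetical sketch: user IDs are embedded as interest vectors and
    concatenated with the attended image feature and the previous word
    embedding before each LSTM step (names and sizes are illustrative).
    """

    def __init__(self, vocab_size, num_users, feat_dim=2048,
                 embed_dim=512, user_dim=128, hidden_dim=512):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.user_embed = nn.Embedding(num_users, user_dim)    # interest vector
        self.att = nn.Linear(feat_dim + hidden_dim, 1)         # additive attention score
        self.lstm = nn.LSTMCell(embed_dim + feat_dim + user_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, regions, prev_word, user_id, state):
        # regions: (B, R, feat_dim) region features extracted from the image
        h, c = state
        # top-down style attention: score each region given the current hidden state
        scores = self.att(torch.cat(
            [regions, h.unsqueeze(1).expand(-1, regions.size(1), -1)], dim=-1))
        alpha = torch.softmax(scores, dim=1)                   # (B, R, 1)
        context = (alpha * regions).sum(dim=1)                 # attended image feature
        # condition the step on the previous word, visual context, and user interest vector
        x = torch.cat([self.word_embed(prev_word),
                       context,
                       self.user_embed(user_id)], dim=-1)
        h, c = self.lstm(x, (h, c))
        return self.out(h), (h, c)                             # word logits, new state

# toy usage with random inputs
decoder = UserAwareDecoder(vocab_size=1000, num_users=50)
regions = torch.randn(2, 36, 2048)
state = (torch.zeros(2, 512), torch.zeros(2, 512))
logits, state = decoder(regions, torch.tensor([1, 2]), torch.tensor([3, 7]), state)
```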
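The domain-invariant constraint in (2) minimizes a distance between sentence-level features of the source and target domains in the hidden space. The sketch below uses a simple first-moment (mean-feature) distance as a stand-in; the dissertation's actual distance measure, loss weight, and feature definition are not reproduced here, and the names are hypothetical.

```python
import torch

def domain_distance_loss(src_feats, tgt_feats):
    """Distance between sentence-level features of the source and target domains.

    Illustrative stand-in for a distance-based domain constraint: the squared
    L2 distance between the mean feature of each domain batch (first-moment
    matching); the dissertation's exact distance measure may differ.
    """
    return (src_feats.mean(dim=0) - tgt_feats.mean(dim=0)).pow(2).sum()

# sentence-level features of a source-domain batch and a target-domain batch,
# e.g. final decoder hidden states (random placeholders here)
src_feats = torch.randn(32, 512, requires_grad=True)
tgt_feats = torch.randn(32, 512, requires_grad=True)
caption_loss = torch.tensor(0.0)   # placeholder for the usual cross-entropy caption loss
total_loss = caption_loss + 0.1 * domain_distance_loss(src_feats, tgt_feats)
total_loss.backward()
```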
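The copy mechanism in (3) lets the decoder choose, at each step, between generating a word from the caption vocabulary and copying a candidate word proposed by the object detector, which is how novel words can enter the output sentence. The sketch below shows one gated mixture step; CopyStep, the extended-vocabulary indexing, and all sizes are illustrative assumptions rather than the dissertation's implementation.

```python
import torch
import torch.nn as nn

class CopyStep(nn.Module):
    """One decoding step with a copy gate over detected candidate words.

    The final word distribution is a gated mixture of (a) the usual softmax
    over the caption vocabulary and (b) attention weights over candidate
    words proposed by an object detector, scattered into an extended
    vocabulary that can contain novel words.
    """

    def __init__(self, hidden_dim, vocab_size, extended_size):
        super().__init__()
        self.gen = nn.Linear(hidden_dim, vocab_size)   # generate from the base vocabulary
        self.copy_gate = nn.Linear(hidden_dim, 1)      # probability of copying at this step
        self.vocab_size = vocab_size
        self.extended_size = extended_size             # base vocabulary + detector labels

    def forward(self, h, cand_scores, cand_ids):
        # h: (B, hidden); cand_scores: (B, K) scores over detected candidates;
        # cand_ids: (B, K) indices of those candidates in the extended vocabulary
        p_copy = torch.sigmoid(self.copy_gate(h))                 # (B, 1)
        gen_dist = torch.softmax(self.gen(h), dim=-1)             # (B, V)
        copy_dist = torch.softmax(cand_scores, dim=-1)            # (B, K)
        out = torch.zeros(h.size(0), self.extended_size)
        out[:, :self.vocab_size] = (1 - p_copy) * gen_dist        # generated words
        out.scatter_add_(1, cand_ids, p_copy * copy_dist)         # copied (possibly novel) words
        return out                                                # (B, extended_size)

# toy usage: 2 images, 5 detected candidate words each, 100-word base vocabulary
step = CopyStep(hidden_dim=512, vocab_size=100, extended_size=120)
probs = step(torch.randn(2, 512),
             torch.randn(2, 5),
             torch.randint(100, 120, (2, 5)))
```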
Keywords/Search Tags: image captioning, domain adaptation, personalization, novel words