
Generating Image Captions From Structural Words

Posted on: 2018-05-04  Degree: Master  Type: Thesis
Country: China  Candidate: S B Ma  Full Text: PDF
GTID: 2348330542484891  Subject: Software engineering
Abstract/Summary:
With the continuous development of artificial intelligence in the field of multimedia, and of deep learning in particular, generating semantic descriptions for images has become increasingly prevalent in recent years. A sentence that contains objects together with their attributes and the activity or scene involved is more informative: it expresses more of the image's semantics and is easier to understand. In this paper, we focus on generating descriptions for images from structural words we have recognized, i.e., a tetrad of <object, attribute, activity, scene>, using a two-step framework. In the first step, we propose a multi-task learning method to recognize the structural words <object, attribute, activity, scene>. In the second step, taking the word sequence as the source language, we train an LSTM encoder-decoder machine translation model to output the target caption. In particular, the generated description is composed of objects with their attributes, such as color and size, and the corresponding activities or scenes. To demonstrate that the multi-task learning method generates structural words effectively, we conduct experiments on the benchmark datasets aPascal and aYahoo. We also use the UIUC Pascal, Flickr8k, Flickr30k, and MSCOCO datasets to show that translating structural words into sentences achieves promising performance compared with state-of-the-art image captioning methods in terms of language generation metrics.
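The two-step framework above can be sketched in code. This is a minimal, untrained illustration, not the thesis's actual model: the class names, vocabulary sizes, and random weights are all hypothetical, and real systems would use a CNN feature extractor and train both stages on the datasets mentioned. It only shows the shape of the pipeline: a shared image feature feeding four task-specific heads (multi-task recognition of the tetrad), whose output word sequence is then fed to an LSTM encoder-decoder that greedily decodes a caption.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class StructuralWordRecognizer:
    """Step 1 (sketch): one shared image feature, four task heads,
    one head per slot of the tetrad <object, attribute, activity, scene>."""
    def __init__(self, feat_dim, vocab_sizes):
        # Hypothetical linear heads; a real model would share a CNN backbone.
        self.heads = {task: rng.standard_normal((feat_dim, n)) * 0.01
                      for task, n in vocab_sizes.items()}

    def predict(self, feat):
        # Arg-max word index per task.
        return {task: int(np.argmax(softmax(feat @ W)))
                for task, W in self.heads.items()}

class LSTMCell:
    """Single numpy LSTM cell (input/forget/output/candidate gates)."""
    def __init__(self, in_dim, hid_dim):
        self.hid = hid_dim
        self.W = rng.standard_normal((in_dim + hid_dim, 4 * hid_dim)) * 0.1
        self.b = np.zeros(4 * hid_dim)

    def step(self, x, h, c):
        z = np.concatenate([x, h]) @ self.W + self.b
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        return h, c

class WordsToCaption:
    """Step 2 (sketch): treat the structural-word sequence as the source
    language and decode a target caption with an encoder-decoder."""
    def __init__(self, src_vocab, tgt_vocab, emb=16, hid=32):
        self.src_emb = rng.standard_normal((src_vocab, emb)) * 0.1
        self.tgt_emb = rng.standard_normal((tgt_vocab, emb)) * 0.1
        self.enc = LSTMCell(emb, hid)
        self.dec = LSTMCell(emb, hid)
        self.out = rng.standard_normal((hid, tgt_vocab)) * 0.1
        self.tgt_vocab = tgt_vocab

    def translate(self, src_ids, bos=0, eos=1, max_len=10):
        h, c = np.zeros(self.enc.hid), np.zeros(self.enc.hid)
        for idx in src_ids:                  # encode the word tetrad
            h, c = self.enc.step(self.src_emb[idx], h, c)
        caption, tok = [], bos
        for _ in range(max_len):             # greedy decoding
            h, c = self.dec.step(self.tgt_emb[tok], h, c)
            tok = int(np.argmax(softmax(h @ self.out)))
            if tok == eos:
                break
            caption.append(tok)
        return caption

# Toy vocabularies (hypothetical sizes); real ones come from the datasets.
vocab_sizes = {"object": 5, "attribute": 4, "activity": 3, "scene": 3}
recognizer = StructuralWordRecognizer(feat_dim=8, vocab_sizes=vocab_sizes)
words = recognizer.predict(rng.standard_normal(8))

translator = WordsToCaption(src_vocab=15, tgt_vocab=20)
caption = translator.translate(list(words.values()))
print(words, caption)
```

With random weights the output caption is meaningless token indices; the point is only the data flow, image feature to tetrad to caption, which matches the two steps described in the abstract.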
Keywords/Search Tags: Image Description, Structural Word, Multi-task Learning, LSTM, Machine Translation