Recently, image semantic caption generation has received increasing attention as a fundamental research problem in artificial intelligence. This technique serves as a bridge connecting image processing in computer vision with sequence generation in natural language processing. Automatically generating descriptions of images is very useful in practice; for example, it can help visually impaired people understand image contents and can improve image retrieval quality by discovering salient content. Much progress has been made in image captioning, and the encoder-decoder framework has achieved outstanding performance on this task. In this paper, we propose a novel architecture, the Auto-Reconstructor Network (ARNet), which, coupled with the conventional encoder-decoder framework, works in an end-to-end fashion to generate captions. ARNet aims to reconstruct the previous hidden state from the current one in recurrent neural networks (RNNs), in addition to acting as the information transition operator. ARNet therefore encourages the current hidden state to embed more information from the previous one and exploits the deeper relationship between them, which helps regularize the transition dynamics of RNNs. Extensive experimental results show that our proposed ARNet boosts the performance of existing encoder-decoder models on the image semantic captioning task. Additionally, we quantitatively evaluate the discrepancy between the training and inference processes for caption generation and demonstrate that ARNet markedly reduces this discrepancy. Furthermore, performance on permuted sequential MNIST demonstrates that ARNet can effectively regularize RNNs, especially for modeling long-term dependencies.
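The core idea above, reconstructing the previous hidden state from the current one and using the reconstruction error as a regularizer, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dimensions, the random stand-in hidden states, and the single linear map `W` (the paper couples the reconstructor with the recurrent decoder) are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical; in the paper these are LSTM hidden states
# produced by the caption decoder).
hidden_size, steps = 8, 5

# A sequence of hidden states h_1 .. h_T (random stand-ins here).
h = rng.standard_normal((steps, hidden_size))

# ARNet-style reconstructor: map the current state h_t back toward the
# previous state h_{t-1}. A single linear map is a simplification of the
# recurrent reconstructor described in the paper.
W = rng.standard_normal((hidden_size, hidden_size)) * 0.1

def arnet_loss(h, W):
    """Mean squared error between reconstructed and true previous states.

    This auxiliary loss would be added to the usual captioning
    (cross-entropy) loss during end-to-end training.
    """
    recon = h[1:] @ W          # reconstructed \hat{h}_{t-1} from h_t
    diff = recon - h[:-1]      # compare against the true h_{t-1}
    return float(np.mean(diff ** 2))

print(arnet_loss(h, W))
```

Minimizing this term alongside the captioning loss pushes each hidden state to retain information about its predecessor, which is the regularizing effect described above.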