Design And Implementation Of Image Captioning Model Based On Deep Learning

Posted on:2019-01-08

Degree:Master

Type:Thesis

Country:China

Candidate:Y Liu

GTID:2428330566497304

Subject:Software engineering

Abstract/Summary:

Image captioning is an active research topic in deep learning that connects computer vision and natural language processing. Currently, the main focus of image captioning models is how to design a more effective visual attention mechanism, so that the model can better extract and use image features while generating captions. However, most traditional approaches tend to adopt regular language structure patterns. That is, they fall into a stereotype of replicating words or phrases that are frequent in the dataset, and cannot generate richer and more varied captions based on the unique characteristics of an image. This paper holds that the main reason for these issues is that traditional models generally use an LSTM to generate the image description, which prevents the model from learning and using the syntactic features of natural sentences.

Therefore, this paper presents an image captioning model based on a self-attention mechanism and a spatial attention mechanism, which adopts the popular Encoder-Decoder framework. In the Encoder module, a convolutional neural network is applied to extract image features. The Decoder, instead of a traditional LSTM, consists of a number of stacked submodules, each composed of a multi-head spatial attention sublayer, a multi-head self-attention sublayer and a fully connected feed-forward network sublayer. Among them, the multi-head spatial attention sublayer uses the spatial attention mechanism to select and use image features, and the multi-head self-attention sublayer uses the self-attention mechanism to capture the syntactic or grammatical features of natural sentences.

After proposing and designing the new image captioning model, this paper introduces the specific implementation of each of its modules. In addition, this paper describes the testing and evaluation of the model on the MSCOCO dataset, and the results show that it outperforms models based only on various visual attention mechanisms.
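
The decoder submodule described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the thesis implementation, and several details are assumptions made only for illustration: the sublayer order (masked self-attention, then spatial attention over image regions, then feed-forward, following the usual Transformer convention), the residual connections, the omission of learned query/key/value projections and normalization, the random weights, and all names (`decoder_block`, `multi_head`, `regions`, and so on).

```python
import math
import random

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v, causal=False):
    """Scaled dot-product attention; causal=True hides later positions."""
    scale = math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        scores = [sum(a * b for a, b in zip(qi, kj)) / scale for kj in k]
        if causal:
            scores = [s if j <= i else -1e9 for j, s in enumerate(scores)]
        w = softmax(scores)
        out.append([sum(wj * vj[d] for wj, vj in zip(w, v)) for d in range(len(v[0]))])
    return out

def multi_head(x, memory, heads, causal=False):
    """Split features into `heads` slices, attend per slice, concatenate.
    Assumption: learned per-head projections are left out for brevity."""
    d = len(x[0]) // heads
    parts = []
    for h in range(heads):
        sl = slice(h * d, (h + 1) * d)
        q = [row[sl] for row in x]
        kv = [row[sl] for row in memory]
        parts.append(attention(q, kv, kv, causal))
    return [sum((p[i] for p in parts), []) for i in range(len(x))]

def feed_forward(x, w1, w2):
    """Position-wise two-layer network with a ReLU in between."""
    hidden = [[max(0.0, v) for v in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)

def add(a, b):
    """Element-wise sum, used here for the assumed residual connections."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def decoder_block(words, regions, w1, w2, heads=2):
    # Self-attention over the caption words generated so far (masked).
    x = add(words, multi_head(words, words, heads, causal=True))
    # Spatial attention: each word position attends over image regions.
    x = add(x, multi_head(x, regions, heads))
    # Feed-forward sublayer applied to each position independently.
    return add(x, feed_forward(x, w1, w2))

def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
d_model = 4
words = rand_matrix(3, d_model, rng)    # 3 word embeddings (hypothetical)
regions = rand_matrix(5, d_model, rng)  # 5 CNN region features (hypothetical)
w1 = rand_matrix(d_model, 8, rng)
w2 = rand_matrix(8, d_model, rng)
out = decoder_block(words, regions, w1, w2)
print(len(out), len(out[0]))  # 3 4
```

Because the self-attention sublayer is masked, the output for the first word does not depend on later words, which is what allows a decoder of this kind to generate a caption one word at a time without a recurrent LSTM state.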

Keywords/Search Tags:

image caption, convolutional neural network, spatial attention, self-attention

Related items

1	Research On Image Caption Algorithm Based On Attention Mechanism
2	Research On Image Caption Method Based On Attention Feedback Mechanism
3	Image Chinese Caption Generation Based On Visual Attention And Topic Model
4	Image Caption Generation With Region Based Attention Scheme
5	Research On Image Caption Based On Dual LSTM
6	Image Caption Model Based On Feature Extraction Via Dense Convolutional Neural Network
7	Image Caption Algorithm Based On Graph Convolution Networks And Attention Mechanism
8	Research On Image Caption Based On Attention Mechanism
9	Research On Image Caption Generation Method Based On Deep Learning
10	Study On Image Captioning Based On Spatial Topological Relationship