Research On Image Captioning Method Based On Deep Neural Networks And Adaptive Attention Mechanism

Posted on:2021-04-28

Degree:Master

Type:Thesis

Country:China

Candidate:D L Liang

GTID:2428330620969913

Subject:Computer application technology

Abstract/Summary:

Image captioning, which lies at the intersection of computer vision and natural language processing, is a challenging research task. It aims to have a computer automatically generate descriptive text for an image. Compared with traditional image captioning methods, neural network-based methods are more efficient and generate more natural sentence descriptions. This paper combines deep neural networks with attention mechanisms to develop an efficient image captioning algorithm. The main research work and contributions are as follows:

(1) An image captioning model based on long short-term adaptive attention is proposed. Traditional attention-based image captioning models usually combine the attention mechanism with a long short-term memory (LSTM) network and adjust the model's attention according to the LSTM hidden state. However, because the hidden state stores only limited information, the model lacks sufficient reference information and has difficulty locating the image regions most relevant to the current time step. To address this problem, the proposed model uses the hidden state and the memory cell state of the LSTM to guide two separate attention modules and connects them through an adjustment factor, so that the model can draw on both sources of information when inferring which image regions to attend to at the current time step. Experiments and comparisons with mainstream image captioning models verify the validity of the proposed model.

(2) Building on the model above, an improved input scheme is proposed. The weighted image features produced by the attention module change at every time step during word generation; feeding them into the LSTM together with the word vector hinders the LSTM from learning the text sequence. This paper therefore proposes feeding global image features, rather than the weighted image features, into the LSTM. The related experimental results show that this change further improves the model's performance.
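The abstract does not give the exact formulation, so the following is only a minimal sketch of the idea in contribution (1) and the input change in contribution (2). Everything specific here is an assumption made for illustration: dot-product scoring, a scalar sigmoid adjustment factor `beta` computed from hypothetical weights `gate_w`, a convex combination of the two context vectors, and mean pooling as the "global feature". None of these names or choices are taken from the thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(regions, query):
    """Soft attention: score each region against the query, return the weighted sum.

    Dot-product scoring is an assumption; the thesis may use a learned scoring network.
    """
    weights = softmax([dot(r, query) for r in regions])
    dim = len(regions[0])
    context = [sum(w * r[i] for w, r in zip(weights, regions)) for i in range(dim)]
    return context, weights


def adaptive_attention(regions, hidden, cell, gate_w):
    """Blend hidden-state-guided and cell-state-guided attention.

    `gate_w` is a hypothetical learned vector; `beta` plays the role of the
    adjustment factor mentioned in the abstract (assumed scalar, sigmoid-gated).
    """
    ctx_h, _ = attend(regions, hidden)  # attention guided by the LSTM hidden state
    ctx_c, _ = attend(regions, cell)    # attention guided by the LSTM memory cell state
    beta = 1.0 / (1.0 + math.exp(-dot(gate_w, hidden + cell)))
    context = [beta * a + (1.0 - beta) * b for a, b in zip(ctx_h, ctx_c)]
    return context, beta


def global_feature(regions):
    """Time-invariant image summary (mean pooling is an assumption).

    In the spirit of contribution (2), this vector, rather than the per-step
    attended context, would accompany the word vector as LSTM input.
    """
    n = len(regions)
    return [sum(r[i] for r in regions) / n for i in range(len(regions[0]))]


# Toy usage: two image regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0]]
hidden = [2.0, 0.0]   # favours region 0
cell = [0.0, 2.0]     # favours region 1
gate_w = [0.0, 0.0, 0.0, 0.0]  # zero weights give an even blend of the two modules
context, beta = adaptive_attention(regions, hidden, cell, gate_w)
```

With zero gate weights the two attention modules are blended evenly, so the toy context sits midway between what the hidden state and the cell state would each select; a trained gate would shift that balance at each time step.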

Keywords/Search Tags:

convolutional neural network, long short-term memory networks, adaptive adjustment, attention mechanism

Related items

1	Research On Network Intrusion Detection Method Based On Bi-LSTM
2	Research On Relation Classification Via Bidirectional Long Short-Term Memory Networks With Attention Mechanism
3	Research On Image Caption Via Incorporating Attention And Long Short-Term Memory Network
4	Research On Chinese Event Extraction Via Incorporating Attention Mechanism And Long Short-Term Memory Networks
5	Research On Image Caption Based On Attention Mechanism
6	Chinese Sign Language Recognition Based On Convolutional Network And Long Short Term Memory Network
7	Research On Deep Learning Algorithm For Sequence Data
8	Group Activity Recognition Algorithm Research Based On Attention Mechanism And Deep Learning Network
9	Text Classification Research Based On Deep Neural Network And Attention Mechanism
10	Text Sentiment Classification Based On Attention Mechanism