
Research on Automatically Generating Image Captions

Posted on: 2018-05-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Shen
Full Text: PDF
GTID: 2348330533961374
Subject: Computer Science and Technology
Abstract/Summary:
Automatically describing an image with sentence-level captions has become a hot research topic in recent years, and the rapid progress of deep learning has greatly promoted the development of image captioning. Among existing methods, the Long Short-Term Memory (LSTM) network is the most widely used: it can store both long-term and short-term memory, and it alleviates the vanishing and exploding gradient problems.

Although related research has achieved strong performance in image captioning, several problems remain: (1) during training, how to model the caption in both directions and learn richer contextual information from the image description; (2) during sampling, how to avoid taking only the prediction at time t-1 as the input at time t, so as to reduce cumulative error and prevent compounding wrong decisions; (3) how to generate higher-quality textual descriptions with a better model.

To address these problems, this thesis presents a method for automatically generating image captions based on a bi-directional Long Short-Term Memory network with scheduled sampling (BLSTM-S). The main contributions are as follows:

(1) We propose a bi-directional Long Short-Term Memory network trained with scheduled sampling. As in a fill-in-the-blank exercise on an English examination, the word that suits a blank depends not only on the forward context of the sentence but also on its backward context. Compared with a unidirectional LSTM, BLSTM-S can learn both the forward and the backward information of the image description and thus generate better captions (a minimal sketch of the idea follows this abstract).

(2) We use scheduled sampling to select the input word at each training step. In contrast to the previous practice of feeding the ground-truth token at every step, scheduled sampling flips a coin: with probability ε it feeds the true previous token, and with probability 1-ε it feeds an estimate produced by the model itself. This reduces the inconsistency between the training and inference stages and avoids compounding bad decisions (see the sketch below).

(3) To obtain better results, during testing we use beam search, keeping the k most probable candidate sequences and taking the highest-probability one as the output at each step (a sketch is also given below).

Finally, to verify the validity and accuracy of the BLSTM-S model, we conduct extensive experiments on the Flickr8k, Flickr30k, and MSCOCO datasets. The results show that our method outperforms related methods on all three datasets.
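The following is a minimal, framework-free sketch of the bidirectional encoding idea in contribution (1). Here `fwd_step` and `bwd_step` are hypothetical single-step recurrent functions standing in for the two LSTM directions; in the thesis's setting they would be the forward and backward LSTM cells.

```python
def bidirectional_encode(tokens, fwd_step, bwd_step):
    """Run a left-to-right and a right-to-left recurrent pass over a
    caption and pair the two hidden states at each position, so every
    word sees both its forward and backward context."""
    h_fwd, state = [], None
    for tok in tokens:                      # forward pass
        state = fwd_step(state, tok)
        h_fwd.append(state)
    h_bwd, state = [], None
    for tok in reversed(tokens):            # backward pass
        state = bwd_step(state, tok)
        h_bwd.append(state)
    h_bwd.reverse()                         # realign with forward order
    return list(zip(h_fwd, h_bwd))          # combined context per word
```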
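A minimal sketch of the scheduled-sampling coin flip from contribution (2), assuming a hypothetical `model_step` callable that returns the model's predicted token for the current step; this illustrates the technique, not the thesis's actual training code.

```python
import random

def scheduled_inputs(true_tokens, model_step, epsilon, start_token):
    """Choose the decoder input at each step by a coin flip: the
    ground-truth previous token with probability epsilon, or the
    model's own previous prediction with probability 1 - epsilon."""
    inputs, prev = [], start_token
    for t, truth in enumerate(true_tokens):
        inputs.append(prev)
        pred = model_step(prev, t)          # model's estimate at step t
        prev = truth if random.random() < epsilon else pred
    return inputs
```

In practice epsilon is typically decayed over the course of training, so the decoder gradually shifts from ground-truth inputs toward its own predictions, matching the inference-time setting.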
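A minimal beam-search sketch for the decoding step in contribution (3), assuming a hypothetical `step_fn(seq)` that returns (token, probability) pairs for the next position; the real decoder would score tokens with the trained BLSTM-S model.

```python
import heapq
import math

def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    """Keep the beam_width most probable partial captions, expand each
    with candidate next tokens, and return the most probable sequence
    that ends with end_token (or the best partial one at max_len)."""
    beams = [(0.0, [start_token])]            # (cumulative -log prob, tokens)
    completed = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            if seq[-1] == end_token:          # finished caption: set aside
                completed.append((score, seq))
                continue
            for token, prob in step_fn(seq):  # expand with next-token options
                candidates.append((score - math.log(prob), seq + [token]))
        if not candidates:                    # every beam has finished
            break
        beams = heapq.nsmallest(beam_width, candidates, key=lambda c: c[0])
    completed.extend(b for b in beams if b not in completed)
    return min(completed, key=lambda c: c[0])[1]  # lowest -log prob wins
```

With beam_width=1 this reduces to greedy decoding; larger beams trade extra computation for higher-probability captions.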
Keywords/Search Tags: image captioning, bi-directional Long Short-Term Memory, convolutional neural network, scheduled sampling, stochastic gradient descent