Research On Image Description For Complex Scenes

Posted on: 2021-04-26; Degree: Master; Type: Thesis
Country: China; Candidate: Q Y Lv; Full Text: PDF
GTID: 2428330629488202; Subject: Applied Statistics
Abstract/Summary:
In recent years, network structures represented by M-RNN have shown good results in the field of Image Captioning. When a neural network model is used for this task, the design choices along the pipeline, from the encoder's image feature extraction to the decoder's choice of whether to focus on image information or text information when generating the description, all affect the performance of the model. In previous studies using deep learning models, some researchers built models from deep convolutional neural networks and multi-layer bidirectional long short-term memory networks, and others applied attention mechanisms to spatial image features, but none of them explicitly focused on the prediction of abstract concepts. This thesis builds a multi-model deep learning network architecture for the Image Captioning task. Within the Encoder-Decoder framework, different CNNs are used to extract image features, pre-trained GloVe vectors are used as word embeddings in combination with an LSTM, and an improved attention mechanism is added to the encoder to form the final network structure. Prediction on the test set uses Beam Search to select the final image description, and BLEU, METEOR and CIDEr are used to evaluate model performance. The model dynamically processes visual and non-visual information at each time step while generating an image description, so it produces better predictions for abstract concepts. This is the distinguishing feature of this thesis: in the decoder, the model automatically chooses whether to rely on visual information or on the language model, thereby generating a high-quality image description.
Keywords/Search Tags:Image Captioning, Encoder-Decoder, Attention mechanism
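The Beam Search step mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `beam_search`, the `step_fn` interface, and the toy probability table `TOY_MODEL` are all assumptions made for illustration, standing in for a trained decoder that returns next-word probabilities.

```python
import math


def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    """Return the highest-scoring token sequence found by beam search.

    step_fn(sequence) is a hypothetical decoder interface: it must return
    a dict mapping each possible next token to its probability.
    """
    # Each beam entry is (cumulative log-probability, token list).
    beams = [(0.0, [start_token])]
    finished = []
    for _ in range(max_len):
        candidates = []
        for score, seq in beams:
            for token, prob in step_fn(seq).items():
                if prob <= 0.0:
                    continue  # log(0) is undefined; skip impossible tokens
                candidates.append((score + math.log(prob), seq + [token]))
        if not candidates:
            break
        # Keep only the beam_width best partial captions.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, seq in candidates[:beam_width]:
            if seq[-1] == end_token:
                finished.append((score, seq))
            else:
                beams.append((score, seq))
        if not beams:
            break
    # Fall back to unfinished beams if nothing reached end_token.
    pool = finished or beams
    return max(pool, key=lambda c: c[0])[1]


# Hypothetical toy "decoder": next-word probabilities given the last word.
TOY_MODEL = {
    "<s>": {"a": 0.6, "the": 0.4},
    "a": {"dog": 0.5, "cat": 0.5},
    "the": {"dog": 0.9, "cat": 0.1},
    "dog": {"</s>": 1.0},
    "cat": {"</s>": 1.0},
}


def toy_step(seq):
    return TOY_MODEL[seq[-1]]


caption = beam_search(toy_step, "<s>", "</s>", beam_width=2)
print(caption)  # ['<s>', 'the', 'dog', '</s>']
```

With `beam_width=1` the search is greedy and commits to "a" (probability 0.6), ending with a caption of total probability 0.30. With a width of 2 it keeps "the" alive and finds the caption with probability 0.36. Practical captioning systems often also normalise scores by caption length, which this sketch omits.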