
Image Description Method Based On Convolution Recurrent Integrated Models

Posted on: 2018-03-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y Ding
Full Text: PDF
GTID: 2348330563952185
Subject: Computer Science and Technology
Abstract/Summary:
Image description is an important problem in artificial intelligence, with wide applications in image recognition, image detection, unmanned systems, and navigation for the blind. By turning an image processing problem into a text processing problem, image description supports a better understanding of the visual scene. The key problem is, given an image, how to generate accurate and comprehensive sentences that agree with what a human observer sees. Image description draws on both image processing and natural language processing, and this broad knowledge background makes it a difficult and challenging task. Its broad application prospects have nevertheless attracted many researchers. This thesis studies the problem of image description. The main contents are as follows.

1. To address automatic image description, this thesis studies a general image description model that integrates convolutional and recurrent neural networks. In the description process, the image and the text are first represented in a high-dimensional space, and a matching relationship between them is then established in that space. The model consists of three parts. The first is image feature extraction and encoding based on a convolutional neural network. The second is sentence encoding, for which the most straightforward approach is to map a sentence directly into the high-dimensional space. The third is establishing the matching relationship in the high-dimensional space and generating a descriptive sentence with a long short-term memory (LSTM) neural network.

2. To address word vector initialization in the convolutional recurrent integrated model, we studied the sentence encoding process. In the sentence encoding stage, we introduced word2vec to train the word vectors of the sentences. word2vec is itself a neural network model. Compared with randomly generated word vectors, the word vectors obtained through word2vec reflect the correlations among words, which benefits both the quality of the generated sentences and the generalization ability of the model.

3. To address input vector preprocessing in the sentence generation part, we studied the sentence generation model. We propose a long short-term memory network model with an ordinary hidden layer, which differs from the traditional long short-term memory network model. In this model, the word vector first passes through an ordinary hidden layer and then enters the cell unit of the long short-term memory network, where it takes part in the recurrent computation. This change preprocesses the training data well. The learning algorithm used in this model is similar to that of the long short-term memory network: the output of the new layer serves as the input of the original network, and the parameters between the input and the new hidden layer can be learned with the standard backpropagation algorithm.

Experiments on the Flickr8k data set show that introducing either the new ordinary hidden layer or word2vec encoding into the original long short-term memory network improves the accuracy of image description and generates more accurate and appropriate descriptive sentences.
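The modification described in item 3 can be illustrated with a minimal sketch of a single recurrent step. It is an illustration under stated assumptions, not the thesis implementation: the class name HiddenLayerLSTMCell, the layer sizes, the tanh activation of the extra hidden layer, and the random weight initialization are all hypothetical choices, since the abstract does not specify them. The sketch shows only the forward pass, in which a word vector goes through an ordinary hidden layer before it reaches the gates of a standard LSTM cell.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class HiddenLayerLSTMCell:
    """LSTM cell preceded by an ordinary hidden layer (hypothetical sketch)."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, embed_dim, pre_dim, state_dim, seed=0):
        rng = random.Random(seed)
        # Extra ordinary hidden layer: word vector -> preprocessed vector.
        self.W_pre = rand_matrix(pre_dim, embed_dim, rng)
        self.b_pre = [0.0] * pre_dim
        # Standard LSTM weights, applied to [preprocessed vector, previous hidden state].
        self.W = {g: rand_matrix(state_dim, pre_dim + state_dim, rng) for g in self.GATES}
        self.b = {g: [0.0] * state_dim for g in self.GATES}

    def step(self, x, h_prev, c_prev):
        # 1. Ordinary hidden layer (tanh is an assumed activation).
        z = [math.tanh(a + b) for a, b in zip(matvec(self.W_pre, x), self.b_pre)]
        # 2. Standard LSTM gates, fed with z in place of the raw word vector.
        zh = z + h_prev
        pre = {g: [a + b for a, b in zip(matvec(self.W[g], zh), self.b[g])]
               for g in self.GATES}
        i = [sigmoid(v) for v in pre["input"]]
        f = [sigmoid(v) for v in pre["forget"]]
        o = [sigmoid(v) for v in pre["output"]]
        cand = [math.tanh(v) for v in pre["candidate"]]
        # 3. Cell state and hidden state updates.
        c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, cand)]
        h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
        return h, c


# Run two made-up 4-dimensional word vectors through the cell.
cell = HiddenLayerLSTMCell(embed_dim=4, pre_dim=3, state_dim=2)
h, c = [0.0, 0.0], [0.0, 0.0]
for x in ([0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.5, 0.2]):
    h, c = cell.step(x, h, c)
```

Because the extra layer is an ordinary feed-forward layer placed in front of the cell, gradients with respect to W_pre and b_pre follow from the usual chain rule once the gradient with respect to z is known, which matches the statement that standard backpropagation suffices for the new parameters.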
Keywords/Search Tags:Image Description, Deep Learning, Neural Network, Word2vec, Language Model