Image Retrieval And Annotation Based On Deep Learning And Visual Mechanism

Posted on: 2017-01-18
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
Full Text: PDF
GTID: 2428330569998757
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet, more and more digital images and photos are shared online, and images have become an essential part of daily life. Image feature representation is the key to describing image content, and the quality of that representation directly determines the results of retrieval and annotation. Compared with hand-crafted features, convolutional neural networks (CNNs) have a clear advantage in image feature representation. The visual attention mechanism lets a neural network learn automatically which image regions to focus on by assigning different weights to different parts of the image. This thesis studies image retrieval and image annotation based on deep learning and the visual attention mechanism. The main work includes the following aspects:

An image retrieval method based on the visual attention mechanism and a recurrent neural network is proposed. The method first uses a CNN to extract low-level features, then models the correlation between local regions with a middle-layer LSTM, and finally generates a set of vectors describing the image with an attention-based LSTM. Retrieval is completed by using the Hungarian algorithm to calculate the similarity between images. The experimental results show that, on multi-label data sets, the proposed method significantly improves retrieval performance over methods that use a fixed-length image description. The influence of the middle-layer LSTM on retrieval accuracy is also analyzed, and the retrieval performance is evaluated.

A variable-length image annotation method based on a recurrent neural network is proposed. The method first extracts low-level image features with a CNN and uses them to initialize the hidden state of an LSTM, then embeds tag sequence information into the LSTM network, and, once trained, predicts a tag sequence of variable length. Experimental results show that the proposed method significantly improves annotation quality over fixed-length image annotation.

A variable-length image annotation method based on the visual attention mechanism is proposed. By introducing visual attention, the method assigns different weights to the different items of the input sequence. Experiments show that this method improves annotation quality compared with the variable-length annotation method based on the recurrent neural network alone. The thesis also analyzes the role of the visual attention mechanism in the annotation task and the difficulties in researching it.
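The retrieval step above compares two images that are each described by a set of vectors, using the Hungarian algorithm. The abstract does not give the exact similarity formula, so the following is only a minimal sketch under stated assumptions: cosine similarity between individual vectors, a cost of 1 minus that similarity, and an image-level score equal to the mean similarity over the optimally matched pairs. The function names (cosine, hungarian, set_similarity) are hypothetical and are not taken from the thesis.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hungarian(cost):
    """Minimum-cost assignment for an n x m cost matrix with n <= m.

    Returns a list of (row, column) pairs, one pair per row.
    Uses the potentials-based O(n^2 * m) form of the Hungarian algorithm.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0.0] * (n + 1)   # row potentials (1-indexed)
    v = [0.0] * (m + 1)   # column potentials (1-indexed)
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (0 = none)
    way = [0] * (m + 1)   # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path that was found.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    return [(p[j] - 1, j - 1) for j in range(1, m + 1) if p[j] != 0]


def set_similarity(vecs_a, vecs_b):
    """Assumed image-level score: mean cosine similarity over the optimal matching."""
    if len(vecs_a) > len(vecs_b):
        vecs_a, vecs_b = vecs_b, vecs_a  # keep the smaller set as the rows
    if not vecs_a:
        return 0.0
    sim = [[cosine(a, b) for b in vecs_b] for a in vecs_a]
    cost = [[1.0 - s for s in row] for row in sim]
    pairs = hungarian(cost)
    return sum(sim[i][j] for i, j in pairs) / len(pairs)


if __name__ == "__main__":
    # Two toy "images" whose descriptors are the same vectors in a different order.
    image_a = [[1.0, 0.0], [0.0, 1.0]]
    image_b = [[0.0, 1.0], [1.0, 0.0]]
    print(set_similarity(image_a, image_b))
```

Because the matching is one-to-one and order-independent, two images whose descriptor sets contain the same vectors in a different order receive the same score as if the order matched, which is the property a set-based (rather than fixed-length) image description needs. When the two sets differ in size, this sketch leaves the extra vectors of the larger set unmatched; that choice is an assumption, not something the abstract specifies.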
Keywords/Search Tags: Deep Learning, CNN, RNN, LSTM, Visual Attention Mechanism, Image Retrieval, Image Annotation