Research On Automatic Image Annotation Based On Deep Learning

Posted on: 2019-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: Q Zhang
GTID: 2428330563991565
Subject: Information and Communication Engineering
Abstract/Summary:
The goal of automatic image annotation (AIA) is to assign a set of rich, appropriate keywords to an untagged image so that they accurately describe its content. Because it describes images at the semantic level, image annotation has many applications not only in image analysis and understanding but also in related disciplines such as urban management and biomedical engineering. The key to AIA is to mine the correlation between low-level visual features and high-level semantics so as to narrow the "semantic gap". With the rapid development of deep learning, researchers have tried to exploit the deep architecture and strong representational ability of deep neural networks to obtain more robust image features and to uncover deeper associations between image features and semantics for automatic image annotation. This thesis proposes a deep-learning-based image annotation method and designs and implements the corresponding algorithms to verify the effectiveness of the model.

On the one hand, on the basis of an extensive literature review, this thesis divides automatic image annotation methods into five major categories, describes and analyzes them in terms of model framework, main ideas, starting point, main focus, and complexity, and makes a detailed comparison among them. Problems that remain unsolved in the field of automatic image annotation are also discussed and analyzed.

On the other hand, we propose a deep-learning-based automatic image annotation framework built on image nearest neighbors. A deep neural network is used to abstract the raw data into a stable feature representation for annotation, and the robust learning ability of deep learning is used to capture the deeper connection between image visual features and high-level semantics so as to better complete the annotation task. Specifically, the framework comprises the following research contents.

First, to better represent the image, this thesis proposes a robust image feature representation that combines visual features and semantic features. We integrate the visual features of the image with effective semantic features to obtain an efficient representation: a CNN is used to extract the visual features, and a candidate tag set is constructed from the neighborhood of the image to be annotated, from which the semantic feature representation of the untagged image is obtained through a multi-layer perceptron.

Second, to further improve annotation performance, a tag quantity prediction module is introduced. Since images differ in the complexity of their content and scenes, the model is no longer limited to annotating every image with a fixed number of tags; instead, it automatically predicts the number of tags from the complexity of the image content before completing the annotation. This labeling scheme is also closer to reality.

Finally, using the robust image features obtained above, a multi-label classification model and a tag quantity regression model are trained separately, and the classification results are combined with the predicted number of tags to annotate the image automatically.
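The sketch below illustrates how such a pipeline could be organized in PyTorch-style pseudocode; it is not the thesis's actual implementation. The backbone choice (ResNet-50), layer sizes, the averaging of candidate-tag embeddings, and names such as `AnnotationModel` and `annotate` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torchvision.models as models

class AnnotationModel(nn.Module):
    """Hypothetical sketch: CNN visual features fused with MLP-encoded semantic
    features from a candidate tag set, feeding both a multi-label classifier
    and a tag-quantity regressor."""

    def __init__(self, vocab_size, tag_embed_dim=300, hidden_dim=512):
        super().__init__()
        # Visual branch: a pretrained CNN backbone (ResNet-50 as a stand-in).
        backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # 2048-d pooled features
        # Semantic branch: MLP over the averaged embeddings of the candidate
        # tags gathered from the image's nearest neighbors.
        self.tag_embedding = nn.Embedding(vocab_size, tag_embed_dim)
        self.semantic_mlp = nn.Sequential(
            nn.Linear(tag_embed_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        fused_dim = 2048 + hidden_dim
        # Multi-label classification head: one score per tag in the vocabulary.
        self.classifier = nn.Linear(fused_dim, vocab_size)
        # Tag-quantity prediction head: regresses how many tags the image needs.
        self.count_regressor = nn.Linear(fused_dim, 1)

    def forward(self, images, candidate_tag_ids):
        visual = self.cnn(images).flatten(1)                      # (B, 2048)
        tag_vecs = self.tag_embedding(candidate_tag_ids).mean(1)  # (B, tag_embed_dim)
        semantic = self.semantic_mlp(tag_vecs)                    # (B, hidden_dim)
        fused = torch.cat([visual, semantic], dim=1)
        return self.classifier(fused), self.count_regressor(fused).squeeze(1)

def annotate(model, images, candidate_tag_ids):
    """Combine classification scores with the predicted tag quantity:
    keep the top-k tags, where k is rounded from the regressor's output."""
    scores, counts = model(images, candidate_tag_ids)
    results = []
    for score_row, k in zip(scores, counts.round().clamp(min=1).long()):
        results.append(score_row.topk(int(k)).indices.tolist())
    return results
```

In such a setup the two heads would typically be trained with separate objectives, e.g. a binary cross-entropy loss over the tag vocabulary for the classifier and a regression loss for the tag-count predictor, though the thesis abstract does not specify the exact losses used.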
To verify the performance of the proposed model, experiments are conducted on the standard NUS-WIDE image set tagged with 81 concepts or with 1000 tags. Module-level verification experiments confirm the validity of each functional module of the model (the candidate tag word embedding module and the tag quantity prediction module). Comparisons with classic deep-learning-based annotation models (CNN+softmax, CNN+WARP, CNN-RNN, RIA, SINN, and the tag neighbor + tag vector model) also show that the annotation method proposed in this thesis is valuable.
Keywords/Search Tags:Automatic image annotation, Semantic gap, Deep learning