Automatic Image Annotation Based On Deep Learning With Robust Strategies

Posted on: 2018-10-08; Degree: Master; Type: Thesis
Country: China; Candidate: M K Zhou
GTID: 2428330542490613; Subject: Software engineering
Abstract/Summary:
Automatic image annotation is a technology that automatically adds keywords describing the image content to an unknown image. It is a key step of image retrieval and image understanding in the field of computer vision. This paper focuses on three problems of traditional image annotation methods: unbalanced data are difficult to train on, a single annotation model can hardly improve the overall annotation effect, and manual selection of features is inefficient. A number of solutions are proposed in terms of model theory, annotation framework and visual features. The main work of this paper includes:

(1) Traditional shallow machine learning algorithms generalize poorly when dealing with complex classification problems. Automatic image annotation based on a non-linear-based stacked auto-encoder (NL-SAE) is therefore proposed. For the problem of unbalanced data, this paper proposes a non-linear-based balanced and stacked auto-encoder (NL-BSAE) that enhances training for middle- and low-frequency tags. On the basis of this model, a non-linear-based robust BSAE (NL-RBSAE) algorithm, which enhances training of the sub NL-BSAE models by group, is proposed to improve annotation stability. This strategy ensures that the model itself has a strong ability to deal with unbalanced data. Moreover, for an unknown image, an annotation framework that discriminates the high- and low-frequency attribute of the image is constructed based on the NL-RBSAE algorithm. This strategy ensures that the annotation process also has a strong ability to deal with unbalanced data.

(2) The non-linear-based SAE converges slowly and is not suitable for training on datasets of small and medium scale. Automatic image annotation based on a linear-based stacked auto-encoder (L-SAE) is therefore proposed. This paper first proposes a linear-based balanced and stacked auto-encoder (L-BSAE) that enhances training for middle- and low-frequency tags. On the basis of this model, a linear-based robust BSAE (L-RBSAE) algorithm, which enhances training of the sub L-BSAE models by group, is proposed. Finally, the attribute discrimination annotation framework based on L-RBSAE (L-ADA) is constructed to overcome the slow training of the non-linear-based SAE model on small and medium datasets.

(3) Traditional image annotation methods have two problems. One is that manual selection of features is time-consuming and laborious. The other is that the traditional label propagation algorithm ignores semantic neighbors, so images that are visually similar but semantically dissimilar influence the annotation results. Automatic image annotation based on semantic neighbors combined with deep features is proposed to solve these problems. This method first constructs a unified and adaptive deep feature extraction framework based on a deep convolutional neural network (CNN). Then the training dataset is divided into semantic groups and the neighborhood images of the unannotated image are determined. Finally, according to the visual distance, the contribution value of each label of the neighborhood images is calculated, and the keywords are obtained by sorting the contribution values. This method realizes adaptive feature extraction and improves the annotation effect.

Experimental results on three benchmark datasets show that the methods proposed in this paper effectively improve training on unbalanced data and improve the annotation efficiency for low- and middle-frequency tags. Compared with traditional manual features, the proposed deep feature has lower dimensionality and better effect. To sum up, the methods proposed in this paper effectively improve the accuracy and the number of accurately predicted tags.
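The neighbor-based annotation step described in (3) can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the Euclidean distance, the choice of one semantic group per tag, the `exp(-distance)` weighting and the parameters `k_per_group` and `top_n` are all assumptions made for illustration, and plain lists stand in for the CNN features.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Visual distance between two feature vectors (assumed Euclidean)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def annotate(query, training, k_per_group=2, top_n=3):
    """Rank tags for `query` from its semantic-group neighbors.

    training: list of (feature_vector, set_of_tags) pairs.
    Assumption: each tag defines one semantic group.
    """
    # Divide the training set into semantic groups, one per tag.
    groups = defaultdict(list)
    for idx, (_, tags) in enumerate(training):
        for tag in tags:
            groups[tag].append(idx)

    dist = [euclidean(query, feat) for feat, _ in training]

    # Build the neighborhood: the k nearest images from every group.
    neighborhood = set()
    for members in groups.values():
        nearest = sorted(members, key=lambda i: dist[i])[:k_per_group]
        neighborhood.update(nearest)

    # Each neighbor contributes to its tags; closer images weigh more.
    # The exp(-distance) weighting is an illustrative assumption.
    scores = defaultdict(float)
    for i in neighborhood:
        weight = math.exp(-dist[i])
        for tag in training[i][1]:
            scores[tag] += weight

    # Sort by contribution value (ties broken alphabetically).
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_n]]
```

Sampling neighbors from every semantic group, rather than taking the globally nearest images, is what gives low-frequency tags a chance to appear in the neighborhood; the distance weighting then decides whether they rank high enough to be kept.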
Keywords/Search Tags: stacked auto-encoder, deep learning, image annotation, semantic neighbor, convolutional neural network