
Research On Image Annotation Based On Generative Adversarial Network

Posted on: 2020-05-30
Degree: Master
Type: Thesis
Country: China
Candidate: L C Shui
GTID: 2428330599459679
Subject: Microelectronics and Solid State Electronics
Abstract/Summary:
With the development of Internet technology and the rapid spread of smart devices, hundreds of millions of images are generated and uploaded to the Internet every day. Most of this image data is unorganized yet contains a great deal of information. Automatic image annotation was proposed to manage these images and make use of that information. At present, most automatic image annotation techniques build an annotation model with traditional machine learning or deep learning methods to generate labels for unseen images. However, in most of these methods the number of neurons (classifiers) in the output layer is directly proportional to the size of the label vocabulary, which leads to two problems: 1. When the label vocabulary is very large, the annotation model becomes less practical, because the huge output layer makes the model harder to design and train; 2. The structure of the annotation model is unstable, because it changes whenever the label vocabulary changes. To solve these problems, a new annotation model combining the Generative Adversarial Network (GAN) and Word2vec is designed and implemented. First, each label is mapped by the Word2vec model to a word vector of fixed, freely selectable dimensionality. Second, a neural network model (GAN-W) is built on the GAN, with the number of neurons in its output layer equal to the dimensionality of the word vector. The generator therefore outputs a vector with the same dimensionality as the word vectors, independent of the size of the label vocabulary. Finally, the labels of an image are determined by sorting the model's output. Experiments are conducted on the Corel 5K and IAPR TC-12 image annotation data sets, and the results show that: 1. The experiment on how the vector dimensionality affects annotation performance shows that the model solves the problems described above and that the number of output-layer neurons can be chosen freely over a wide range. 2. Compared with other classical models, the precision P and F1 values are higher, while the recall R is second only to that of the CNN-MLSU model, so the annotation performance is considerably improved. 3. The labels actually produced by the model show that it adapts to the number of labels in each image, which better suits real annotation scenarios. In summary, the model proposed in this thesis removes the proportional dependence of the output layer size on the label vocabulary, improves on the performance of other classical models, and also has advantages in real annotation scenarios.
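The labeling step described above can be illustrated with a minimal sketch. It is not the thesis implementation: the 4-dimensional embeddings stand in for Word2vec vectors, the generator output is a hand-written vector, and both cosine-similarity ranking and the 0.8 threshold are assumptions made here to show how a fixed-size output vector might be turned into a variable number of labels. All names (`LABEL_VECTORS`, `rank_labels`, `annotate`) are hypothetical.

```python
import math

# Hypothetical 4-dimensional label embeddings standing in for Word2vec vectors.
# The embedding dimension (4) is fixed, however many labels the table holds.
LABEL_VECTORS = {
    "sky":   [0.9, 0.1, 0.0, 0.1],
    "sea":   [0.7, 0.3, 0.1, 0.0],
    "tiger": [0.0, 0.1, 0.9, 0.2],
    "grass": [0.1, 0.8, 0.3, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_labels(output_vector, label_vectors):
    """Sort labels by similarity to the model output, best match first."""
    scored = [(label, cosine(output_vector, vec))
              for label, vec in label_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def annotate(output_vector, label_vectors, threshold=0.8):
    """Keep every label whose similarity reaches the (assumed) threshold,
    so the number of labels returned varies from image to image."""
    return [label for label, score in rank_labels(output_vector, label_vectors)
            if score >= threshold]


# A made-up generator output with the same dimension as the word vectors.
generator_output = [0.85, 0.2, 0.05, 0.05]
ranked = rank_labels(generator_output, LABEL_VECTORS)
labels = annotate(generator_output, LABEL_VECTORS)
print(labels)
```

Adding a label to `LABEL_VECTORS` enlarges the vocabulary without changing the length of `generator_output`, which is the structural property the abstract emphasizes.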
Keywords/Search Tags:Automatic image annotation, Deep learning, Generative adversarial network, Label vectorization, Transfer learning