
Research On Image Annotation Algorithm Based On Convolutional Neural Network

Posted on: 2020-05-02
Degree: Master
Type: Thesis
Country: China
Candidate: X J Wu
GTID: 2428330578477961
Subject: Computer Science and Technology
Abstract/Summary:
In the era of big data, a large number of pictures are uploaded to the Internet every day. To manage and retrieve such large-scale image data effectively, efficient automatic image annotation is becoming more and more important. Automatic image annotation uses an algorithm to let the computer assign keywords to an image whose semantic content is related to those keywords. In essence, it establishes a mapping between the high-level semantic information and the low-level features of the image. Traditional image annotation algorithms require manual feature extraction and are not suitable for large-scale datasets. Most deep-learning-based image annotation algorithms ignore the multi-label nature of images and do not consider the correlation between image labels, so the features they extract are not efficient enough. Based on this, this thesis explores in depth the theory and methods of automatic annotation based on convolutional neural networks, and proposes improved algorithms and models for the existing problems. The contributions of this thesis are summarized as follows.

(1) We propose a convolutional neural network image annotation algorithm based on a Sigmoid loss function. To account for the multi-label nature of image annotation tasks, we replace the Softmax loss function commonly used in convolutional neural networks with a Sigmoid loss function, and propose a convolutional neural network model for multi-label images. We also propose representing images with 256-bit coding features for image annotation, which yields features that can be stored conveniently and compared quickly.

(2) We propose a convolutional neural network image annotation algorithm based on a multi-label weighted triplet loss function. In view of the correlation between image tags and the similarity of similar images, we introduce a triplet loss function, which increases the cohesion of the model. At the same time, to solve the problem of Hamming error, we propose a multi-label weighted triplet loss function (MWTL). We then propose a convolutional neural network image annotation model that combines the Inception V4 network model with MWTL.

(3) We propose a convolutional neural network image annotation algorithm based on Spatial SE features. We improve the structure of the convolutional neural network to address the insufficient global feature representation of multi-label images. The Squeeze-and-Excitation (SE) module, which operates at the level of network channels, is introduced into the network. It automatically learns the importance of each feature channel. Using this importance, the network can draw on global information to selectively enhance beneficial feature channels and suppress useless ones, thereby achieving adaptive calibration of the feature channels. On this basis, we also consider the importance of each spatial pixel, introduce spatial pixel weight information into the features, and extract robust and efficient Spatial SE features.

In summary, by addressing both the loss function and the network structure of the model, we propose three convolutional neural network image annotation algorithms. We validate the proposed algorithms on the image annotation task by comparing them with existing image annotation algorithms on multiple image datasets, including Natural Scenes, Corel-5K, ESP-Game, IAPRTC-12 and NUS-WIDE.
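
To illustrate contribution (1), the following is a minimal pure-Python sketch of a per-label sigmoid cross-entropy loss, which scores each keyword independently rather than making the labels compete for probability mass as a softmax does. It is not the thesis implementation. The function names, the example scores, the 0.5 binarization threshold, and the use of Hamming distance to compare codes are all assumptions made for illustration, and the example uses 3 values where the thesis uses 256-bit codes.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def multilabel_sigmoid_loss(logits, targets):
    """Mean per-label binary cross-entropy for one image.

    logits:  raw network scores, one per keyword.
    targets: 0/1 ground-truth flags, one per keyword.
    """
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps))
    return total / len(logits)


def binarize(activations, threshold=0.5):
    # Assumption: a binary code is obtained by thresholding sigmoid activations.
    return [1 if a >= threshold else 0 for a in activations]


def hamming_distance(code_a, code_b):
    # Number of positions at which two equal-length binary codes differ.
    return sum(a != b for a, b in zip(code_a, code_b))


# Hypothetical scores for three keywords, e.g. "sky", "car", "tree".
example_logits = [2.0, -1.5, 3.0]
example_targets = [1, 0, 1]
example_loss = multilabel_sigmoid_loss(example_logits, example_targets)
example_code = binarize([sigmoid(z) for z in example_logits])
print(example_loss, example_code)
```

Because each label gets its own sigmoid, an image can receive several keywords at once with high confidence. Binary codes of this kind are compact to store, and comparing two of them reduces to counting differing bits.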
Keywords/Search Tags:Image Annotation, Convolutional Neural Network, Multi-label, Weighted Triplets Loss, Squeeze-and-Excitation Module